Chapter 4: Convex Optimization Problems

4.1 Optimization Problem in Standard Form

An optimization problem in standard form is written as

min_x f_0(x)
s.t. f_i(x) ≤ 0, i = 1, …, m
     h_j(x) = 0, j = 1, …, p

where:

  • x ∈ R^n is the optimization variable.
  • f_0 : R^n → R is the objective (cost) function.
  • f_i : R^n → R are the inequality constraint functions.
  • h_j : R^n → R are the equality constraint functions.

Optimal Value

The optimal value p* of the problem is defined as

p* = inf { f_0(x) : f_i(x) ≤ 0, i = 1, …, m,  h_j(x) = 0, j = 1, …, p }

  • p* = +∞: the problem is infeasible.
  • p* = −∞: the problem is unbounded below.

Feasible, Optimal, and Locally Optimal Points

  • x is feasible if it satisfies all constraints.

  • x is optimal if x is feasible and f_0(x) = p*.

  • x is locally optimal if there is an R > 0 such that x is optimal for

    min_z f_0(z)
    s.t. f_i(z) ≤ 0, i = 1, …, m
         h_j(z) = 0, j = 1, …, p
         ‖z − x‖_2 ≤ R

  • If x is feasible and f_0(x) ≤ p* + ε, then x is called ε-suboptimal.

Examples (with n = 1, m = p = 0):

  • f_0(x) = log x, dom f_0 = R_++: p* = −∞
  • f_0(x) = x log x, dom f_0 = R_++: p* = −1/e, x* = 1/e
  • f_0(x) = x³ − 3x, dom f_0 = R: p* = −∞, local optimum at x = 1
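A quick numerical sanity check of the second example above (a sketch, assuming numpy is available; the grid resolution is arbitrary):

```python
import numpy as np

# Check that f0(x) = x*log(x) on R++ attains its minimum p* = -1/e at x* = 1/e
# by brute-force evaluation on a fine grid.
x = np.linspace(1e-6, 2.0, 200001)
f = x * np.log(x)
i = np.argmin(f)
x_star, p_star = x[i], f[i]
print(x_star, p_star)  # approx 0.3679 and -0.3679
```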

Implicit Constraints and Domain

The implicit domain constraint is

x ∈ D = ∩_{i=0}^m dom f_i ∩ ∩_{j=1}^p dom h_j

  • D is the domain of the problem.
  • Explicit constraints: f_i(x) ≤ 0, h_j(x) = 0.
  • The problem is unconstrained if m=p=0.

Example:

min_x f_0(x) = −Σ_{i=1}^k log(b_i − a_i^T x)

is an unconstrained problem with implicit constraints a_i^T x < b_i.

Feasibility Problem

A feasibility problem asks whether any feasible point exists:

find x
s.t. f_i(x) ≤ 0, i = 1, …, m
     h_j(x) = 0, j = 1, …, p

This can be considered a special case of the general problem with f_0(x) ≡ 0:

  • p* = 0 if the constraints are feasible (any feasible x is optimal).
  • p* = +∞ if infeasible.

4.2 Convex Optimization Problem

A problem is a convex optimization problem if it is of the form

min_x f_0(x)
s.t. f_i(x) ≤ 0, i = 1, …, m
     a_i^T x = b_i, i = 1, …, p

Definition. An optimization problem is convex if:

  1. f0 is convex.
  2. f_1, …, f_m are convex.
  3. h_1, …, h_p are affine: h_i(x) = a_i^T x − b_i.

Key Property. The feasible set of a convex optimization problem is convex: it is the intersection of m sublevel sets (of convex functions) and p hyperplanes, which is convex.


4.3 Local and Global Optima

Theorem (Local = Global). For a convex optimization problem, every locally optimal point is also globally optimal.

Proof. Suppose x is locally optimal and there exists a feasible y with f_0(y) < f_0(x). For z = θy + (1−θ)x with small θ > 0, convexity gives

f_0(z) ≤ θ f_0(y) + (1−θ) f_0(x) < f_0(x)

But local optimality requires f_0(x) ≤ f_0(z) for all feasible z sufficiently near x, a contradiction.

Uniqueness. If f0 is strictly convex, then a convex optimization problem has at most one global minimizer (uniqueness).


4.4 First-Order Optimality Condition

For a convex problem

min_x f(x) s.t. x ∈ C

Theorem (First-Order Optimality). A feasible point x is optimal if and only if

⟨∇f(x), y − x⟩ ≥ 0  for all y ∈ C

Geometric interpretation:

  • ∇f(x) makes a non-obtuse (≤ 90°) angle with every feasible direction y − x at an optimal x.
  • −∇f(x) defines a supporting hyperplane to the feasible set C at x.
  • ∇f(x) is normal to the tangent plane of the feasible set at x.

Special Cases

Unconstrained optimization:

∇f_0(x) = 0

Equality-constrained minimization:

min_x f_0(x) s.t. Ax = b   ⇔   ∃ν: ∇f_0(x) + A^T ν = 0

Minimization over the nonnegative orthant:

min_x f_0(x) s.t. x ⪰ 0   ⇔   ∇f_0(x) ⪰ 0,  x_i (∇f_0(x))_i = 0, i = 1, …, n

Example: Unconstrained Quadratic Minimization

Consider minimizing the quadratic function:

f(x) = (1/2) x^T Q x + b^T x + c

where Q ⪰ 0. The first-order condition yields

∇f(x) = Qx + b = 0

  • If Q ≻ 0, there is a unique solution: x* = −Q⁻¹ b.
  • If Q is singular and b ∉ R(Q), there is no solution (min_x f(x) = −∞).
  • If Q is singular and b ∈ R(Q), there are infinitely many solutions: x* = −Q⁺ b + z, z ∈ N(Q), where Q⁺ is the pseudoinverse of Q.
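A small numerical sketch of the first and third cases (the matrices below are made up for illustration):

```python
import numpy as np

# Case Q > 0: unique minimizer x* = -Q^{-1} b.
Q = np.array([[2.0, 0.0], [0.0, 1.0]])
b = np.array([1.0, -1.0])
x_unique = -np.linalg.solve(Q, b)

# Case Q singular, b in R(Q): minimizers x* = -Q^+ b + z with z in N(Q).
Qs = np.array([[1.0, 0.0], [0.0, 0.0]])   # singular
bs = np.array([1.0, 0.0])                 # lies in R(Qs)
x_part = -np.linalg.pinv(Qs) @ bs         # one particular minimizer
z = np.array([0.0, 3.0])                  # any z in N(Qs) gives another
grad = Qs @ (x_part + z) + bs             # gradient vanishes at both
print(x_unique, x_part, grad)
```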

4.5 Equivalent Transformations

Two problems are (informally) equivalent if their solutions can be readily converted into each other. Common transformations that preserve equivalence:

Transformations and Change of Variables

If h : R → R is a monotone increasing transformation, then

min_{x∈C} f(x)   ⇔   min_{x∈C} h(f(x))

This can be used to reveal "hidden convexity" of a problem.

If φ : R^m → R^n is one-to-one and its image covers the feasible set C, then we can change variables:

min_x f(x) s.t. x ∈ C   ⇔   min_y f(φ(y)) s.t. φ(y) ∈ C

Example: Log-likelihood transformation. In maximum likelihood estimation (MLE), given independent samples x_1, …, x_n with likelihood

L(θ) = Π_{i=1}^n p(x_i; θ)

applying the strictly increasing log(·) transformation yields

ℓ(θ) = log L(θ) = Σ_{i=1}^n log p(x_i; θ)

with the key property: argmax_θ L(θ) = argmax_θ ℓ(θ).
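A numerical illustration of this key property (a sketch with made-up Gaussian samples of known unit variance; both maximizers coincide with the sample mean):

```python
import numpy as np

samples = np.array([1.0, 2.0, 0.5, 1.5, 2.5])
thetas = np.linspace(-2.0, 5.0, 7001)

# Unnormalized Gaussian likelihood of each candidate theta.
def likelihood(theta):
    return np.prod(np.exp(-0.5 * (samples[:, None] - theta) ** 2), axis=0)

L = likelihood(thetas)
ll = np.log(L)
theta_L = thetas[np.argmax(L)]    # maximizer of L
theta_ll = thetas[np.argmax(ll)]  # maximizer of log L -- same point
print(theta_L, theta_ll)
```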

Eliminating Equality Constraints

Given:

min_x f_0(x)
s.t. f_i(x) ≤ 0, i = 1, …, m
     Ax = b

Any feasible point can be expressed as x = My + x_0, where A x_0 = b and R(M) = N(A). The problem is equivalent to

min_y f_0(My + x_0)
s.t. f_i(My + x_0) ≤ 0, i = 1, …, m
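Constructing such an M and x_0 can be sketched with numpy/scipy (A and b below are made up; `null_space` gives an orthonormal basis of N(A)):

```python
import numpy as np
from scipy.linalg import null_space

A = np.array([[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]])
b = np.array([1.0, 2.0])

x0 = np.linalg.lstsq(A, b, rcond=None)[0]  # one particular solution of Ax = b
M = null_space(A)                          # columns span N(A)

y = np.array([0.7])                        # any y gives a feasible point
x = M @ y + x0
print(A @ x - b)
```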

Introducing Slack Variables

Given:

min_x f_0(x)
s.t. f_i(x) ≤ 0, i = 1, …, m
     Ax = b

the inequality constraints can be transformed via slack variables s_i ≥ 0:

min_{x,s} f_0(x)
s.t. s_i ≥ 0, i = 1, …, m
     f_i(x) + s_i = 0, i = 1, …, m
     Ax = b

Note. The transformed problem is no longer convex unless the f_i are affine, since f_i(x) + s_i = 0 is not an affine equality constraint in general.

Epigraph Formulation

The standard form problem

min_x f_0(x)
s.t. f_i(x) ≤ 0, i = 1, …, m
     Ax = b

is equivalent to its epigraph form:

min_{x,t} t
s.t. f_0(x) − t ≤ 0
     f_i(x) ≤ 0, i = 1, …, m
     Ax = b

Relaxation

Given an optimization problem

min_x f(x) s.t. x ∈ C

we can always take an enlarged constraint set C̃ ⊇ C and consider

min_x f(x) s.t. x ∈ C̃

This is called a relaxation; its optimal value is always less than or equal to that of the original problem (for minimization).

Example: Continuous relaxation of integer constraints.

min_x c^T x s.t. Ax ⪯ b, x ∈ {0,1}^n   →   x ∈ [0,1]^n

  • The feasible set becomes convex (a polytope).
  • The problem reduces to a linear program (LP).
  • It provides a lower bound on the integer optimum.
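The lower-bound property can be checked on a tiny made-up instance (a sketch using scipy's LP solver; the data are arbitrary):

```python
import numpy as np
from scipy.optimize import linprog

# min c^T x  s.t.  Ax <= b,  x in {0,1}^2, relaxed to x in [0,1]^2.
c = np.array([-2.0, -3.0])
A = np.array([[1.0, 2.0]])
b = np.array([2.0])

res = linprog(c, A_ub=A, b_ub=b, bounds=[(0, 1), (0, 1)])
relaxed_opt = res.fun

# Brute-force the integer problem to compare.
integer_opt = min(
    c @ np.array(v)
    for v in [(0, 0), (0, 1), (1, 0), (1, 1)]
    if (A @ np.array(v) <= b).all()
)
print(relaxed_opt, integer_opt)  # relaxation gives a lower bound
```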

Example: Relaxing nonconvex sets. x² + y² = 1 (unit circle) → x² + y² ≤ 1 (unit disk). For non-affine equality constraints h_j(x) = 0 where the h_j are convex, replace them with h_j(x) ≤ 0.

Converting a Nonconvex Problem to a Convex One

Via observation:

Nonconvex problem:

min_{x_1,x_2} f_0(x) = x_1² + x_2²
s.t. f_1(x) = x_1 / (1 + x_2²) ≤ 0
     h_1(x) = (x_1 + x_2)² = 0

f_0 is convex, but f_1 is not convex and h_1 is not affine. The equivalent convex formulation is:

min_{x_1,x_2} x_1² + x_2²
s.t. x_1 ≤ 0
     x_1 + x_2 = 0

Via substitution:

Nonconvex problem: min_x (|x| − 1)². Let t = |x|, with t ≥ 0. The equivalent convex problem is min_t (t − 1)² subject to t ≥ 0.

Example: Geometric program (monomial case). The original nonconvex problem

min_{x≻0} f_0(x) = c_0 x_1^{a_01} x_2^{a_02} ⋯ x_n^{a_0n}
s.t. f_i(x) = c_i x_1^{a_i1} x_2^{a_i2} ⋯ x_n^{a_in} ≤ 1, i = 1, …, m

where c_i > 0, becomes convex via the monotone transformation y_i = log x_i (x_i = e^{y_i}) and taking logarithms:

min_{y∈R^n} log c_0 + a_01 y_1 + ⋯ + a_0n y_n
s.t. log c_i + a_i1 y_1 + ⋯ + a_in y_n ≤ 0, i = 1, …, m

4.6 Linear Programming (LP)

Definition (Linear Program). An LP minimizes a linear objective subject to linear inequality and equality constraints:

min_x c^T x
s.t. Dx ⪯ d, Ax = b

An LP is a convex problem with affine objective and constraint functions. It is the most fundamental problem class in convex optimization.

Inequality and Standard Forms

Inequality form:

min_x c^T x s.t. Ax ⪯ b

Standard form:

min_x c^T x s.t. Ax = b, x ⪰ 0

Conversion between forms:

  • Replace a_i^T x ≤ b_i with a_i^T x + s_i = b_i, s_i ≥ 0 (slack variables).
  • Replace a free variable x_j with x_j = x_j⁺ − x_j⁻, where x_j⁺, x_j⁻ ≥ 0.

Example: Diet Problem

Find the cheapest combination of foods that satisfies certain nutritional requirements:

min_x c^T x
s.t. Dx ⪰ d
     x ⪰ 0

where cj is the per-unit cost of food j, di is the minimum required intake of nutrient i, Dij is the content of nutrient i per unit of food j, and xj is the units of food j in the diet.

Example: Piecewise-linear Minimization

Consider minimizing the piecewise-linear function:

f(x) = max_{i=1,…,m} (a_i^T x + b_i)

This can be transformed to an equivalent LP via its epigraph form:

min_{t,x} t
s.t. a_i^T x + b_i ≤ t, i = 1, …, m

This is an LP (in inequality form) with variables x and t.
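The epigraph LP above can be sketched in one variable with scipy (the coefficients a_i, b_i are made up; variables are (x, t)):

```python
import numpy as np
from scipy.optimize import linprog

# f(x) = max(-x, 0.5x + 1, 2x - 1); minimize via  min t  s.t. a_i x + b_i <= t.
a = np.array([-1.0, 0.5, 2.0])
b = np.array([0.0, 1.0, -1.0])

# Variables z = (x, t); rows encode a_i x - t <= -b_i.
A_ub = np.column_stack([a, -np.ones_like(a)])
b_ub = -b
c = np.array([0.0, 1.0])  # minimize t

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None), (None, None)])
x_opt, t_opt = res.x
print(x_opt, t_opt)  # t_opt equals the minimum of the piecewise-linear max
```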

Example: Chebyshev Center of a Polyhedron

Given a polyhedron P = {x ∈ R^n : Ax ⪯ b}, find the largest Euclidean ball that fits inside it:

B(x_c, r) = {x ∈ R^n : ‖x − x_c‖_2 ≤ r} = {x_c + u : ‖u‖_2 ≤ r}

The problem is formulated as

max_{x_c,r} r
s.t. a_i^T x_c + ‖a_i‖_2 r ≤ b_i, i = 1, …, m

  • ‖a_i‖_2 is the Euclidean norm of the normal vector of the i-th face.
  • Each constraint ensures the ball does not cross the corresponding face.
  • This is a linear program (LP) in x_c and r.
  • Applications: emergency facility placement, sensor/Wi-Fi placement, urban planning.
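A minimal sketch of this LP for the unit square {x : 0 ⪯ x ⪯ 1}, whose Chebyshev center is (0.5, 0.5) with radius 0.5 (the polyhedron is made up for illustration):

```python
import numpy as np
from scipy.optimize import linprog

# Unit square as Ax <= b.
A = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
b = np.array([1.0, 0.0, 1.0, 0.0])

norms = np.linalg.norm(A, axis=1)
# Variables: (xc_1, xc_2, r); maximize r == minimize -r.
A_ub = np.column_stack([A, norms])
c = np.array([0.0, 0.0, -1.0])

res = linprog(c, A_ub=A_ub, b_ub=b,
              bounds=[(None, None), (None, None), (0, None)])
xc, r = res.x[:2], res.x[2]
print(xc, r)  # center (0.5, 0.5), radius 0.5
```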

Example: Basis Pursuit

Consider an underdetermined linear system Ax = b, A ∈ R^{m×n}, m < n. The goal is to find the sparsest solution (minimize the number of nonzero entries).

  • Exact sparsity minimization (the ℓ_0 "norm") is NP-hard.
  • Relaxation: minimize the ℓ_1 norm instead: min_x ‖x‖_1 s.t. Ax = b.
  • The ℓ_1 norm encourages sparsity while remaining convex.

By introducing nonnegative variables u_i ≥ 0 to represent |x_i| (−u_i ≤ x_i ≤ u_i), the LP formulation is

min_{x,u} Σ_{i=1}^n u_i
s.t. Ax = b
     −u_i ≤ x_i ≤ u_i, i = 1, …, n
     u_i ≥ 0, i = 1, …, n
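This LP can be sketched with scipy on a small made-up system (random A, 1-sparse ground truth; the stacked variable is z = (x, u)):

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
m, n = 3, 6
A = rng.standard_normal((m, n))
x_true = np.zeros(n)
x_true[1] = 1.5                      # 1-sparse ground truth
b = A @ x_true

# Objective: sum of u; equality: Ax = b (u unconstrained by A_eq).
c = np.concatenate([np.zeros(n), np.ones(n)])
A_eq = np.hstack([A, np.zeros((m, n))])
# Encode  x - u <= 0  and  -x - u <= 0.
I = np.eye(n)
A_ub = np.vstack([np.hstack([I, -I]), np.hstack([-I, -I])])
b_ub = np.zeros(2 * n)

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b,
              bounds=[(None, None)] * n + [(0, None)] * n)
x_hat = res.x[:n]
print(x_hat)  # feasible, with l1 norm no larger than that of x_true
```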

4.7 Quadratic Programming (QP)

Definition (Quadratic Program). A QP minimizes a quadratic objective subject to linear constraints:

min_x (1/2) x^T Q x + c^T x
s.t. Dx ⪯ d, Ax = b

If Q ⪰ 0 (positive semidefinite), the problem is convex. Convex QPs have globally optimal solutions and can be solved efficiently.

Example: Least Squares

Least squares problem:

min_x ‖Ax − b‖_2²

Least squares with bounds:

min_x ‖Ax − b‖_2² s.t. ℓ ⪯ x ⪯ u

QP formulation:

min_x (1/2) x^T (2A^T A) x − (2A^T b)^T x = (1/2) x^T Q x + c^T x

where Q = 2A^T A ⪰ 0, c = −2A^T b.
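The QP stationarity condition Qx + c = 0 then reproduces the least-squares normal equations A^T A x = A^T b, which can be checked numerically (A and b below are made up):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((8, 3))
b = rng.standard_normal(8)

x_lstsq = np.linalg.lstsq(A, b, rcond=None)[0]  # standard least squares

Q = 2 * A.T @ A
c = -2 * A.T @ b
x_qp = np.linalg.solve(Q, -c)   # QP stationary point x* = -Q^{-1} c
print(np.max(np.abs(x_qp - x_lstsq)))
```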

Example: Distance Between Two Polyhedra

Given polyhedra P_1 = {x ∈ R^n : A_1 x ⪯ b_1} and P_2 = {y ∈ R^n : A_2 y ⪯ b_2}, the distance between them is

min_{x,y} ‖x − y‖_2²
s.t. A_1 x ⪯ b_1, A_2 y ⪯ b_2

This is a convex QP.

Example: Linear Program with Random Cost

Let the cost be c^T x, where c is random with mean c̄ = E[c]. A risk-averse objective is

min_x c̄^T x + λ Var(c^T x)

Expanding the variance:

Var(c^T x) = E[(c^T x − c̄^T x)²] = x^T E[(c − c̄)(c − c̄)^T] x = x^T Σ x

The QP formulation is

min_x c̄^T x + λ x^T Σ x
s.t. Dx ⪯ d, x ⪰ 0

where Σ is the covariance matrix of c.

Example: LASSO (Basis Pursuit with Noise)

LASSO trades off data fidelity and sparsity.

Penalty formulation:

min_x (1/2)‖Ax − b‖_2² + λ‖x‖_1

Constrained formulation:

min_x ‖Ax − b‖_2² s.t. ‖x‖_1 ≤ k

Key characteristics:

  • Handles noisy data (LASSO → BP when noise-free).
  • Sparsity is controlled by λ (or k).
  • Can be written as a QP.

4.8 Quadratically Constrained Quadratic Programming (QCQP)

Definition (QCQP). A QCQP minimizes a quadratic objective subject to quadratic inequality and linear equality constraints:

min_x (1/2) x^T P_0 x + q_0^T x + r_0
s.t. (1/2) x^T P_i x + q_i^T x + r_i ≤ 0, i = 1, …, m
     Ax = b

If P_0, P_1, …, P_m ⪰ 0, the problem is convex.

QCQP generalizes QP by allowing quadratic inequality constraints, and in turn is a special case of SOCP.


4.9 Second-Order Cone Programming (SOCP)

Definition (SOCP). An SOCP is of the form:

min_x c^T x
s.t. ‖D_i x + d_i‖_2 ≤ e_i^T x + f_i, i = 1, …, m
     Ax = b

Second-Order (Lorentz) Cone

The second-order cone (also called the Lorentz cone or ice-cream cone) is

Q = {(x, t) ∈ R^{n+1} : ‖x‖_2 ≤ t}

The constraint ‖D_i x + d_i‖_2 ≤ e_i^T x + f_i is equivalent to (D_i x + d_i, e_i^T x + f_i) ∈ Q.

Why is Q convex? It is the epigraph of the Euclidean norm ‖x‖_2, which is a convex function.

QP → SOCP

A QP can be reformulated as an SOCP:

min_x (1/2) x^T Q x + c^T x s.t. Dx ⪯ d, Ax = b
⇔ min_{x,t} c^T x + (1/2) t s.t. Dx ⪯ d, x^T Q x ≤ t, Ax = b

with x^T Q x ≤ t ⇔ ‖Q^{1/2} x‖_2² ≤ (1/4)(1+t)² − (1/4)(1−t)², i.e. ‖(Q^{1/2} x, (1−t)/2)‖_2 ≤ (1+t)/2, which is an SOC constraint.

Example: Robust Linear Program (Deterministic)

In LP, there may be uncertainty in the data c, ai, or bi. With uncertainty in ai, two approaches:

Deterministic (worst-case guarantee): constraints must hold for all ai in an ellipsoidal uncertainty set

E_i = {ā_i + P_i u : ‖u‖_2 ≤ 1},  ā_i ∈ R^n, P_i ∈ R^{n×n}

Robust LP:

min_x c^T x
s.t. a_i^T x ≤ b_i for all a_i ∈ E_i, i = 1, …, m

Equivalent SOCP:

min_x c^T x
s.t. ā_i^T x + ‖P_i^T x‖_2 ≤ b_i, i = 1, …, m

since sup_{‖u‖_2 ≤ 1} (ā_i + P_i u)^T x = ā_i^T x + ‖P_i^T x‖_2.
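A numerical check of this supremum identity (all data made up; the supremum is attained at u* = P^T x / ‖P^T x‖_2):

```python
import numpy as np

rng = np.random.default_rng(2)
a_bar = rng.standard_normal(4)
P = rng.standard_normal((4, 4))
x = rng.standard_normal(4)

closed_form = a_bar @ x + np.linalg.norm(P.T @ x)

u_star = P.T @ x / np.linalg.norm(P.T @ x)
attained = (a_bar + P @ u_star) @ x          # equals closed_form

# Random unit vectors never exceed the closed form.
U = rng.standard_normal((10000, 4))
U /= np.linalg.norm(U, axis=1, keepdims=True)
sampled_max = ((a_bar + U @ P.T) @ x).max()
print(closed_form, attained, sampled_max)
```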

Example: Robust Linear Program (Stochastic)

Assume a_i ∼ N(ā_i, Σ_i) is Gaussian. Then a_i^T x ∼ N(ā_i^T x, x^T Σ_i x), and

Pr(a_i^T x ≤ b_i) = Φ((b_i − ā_i^T x) / ‖Σ_i^{1/2} x‖_2)

where Φ is the CDF of N(0,1). The robust LP with probability constraints

min_x c^T x
s.t. Pr(a_i^T x ≤ b_i) ≥ η, i = 1, …, m

is equivalent to the SOCP (for η ≥ 1/2):

min_x c^T x
s.t. ā_i^T x + Φ⁻¹(η) ‖Σ_i^{1/2} x‖_2 ≤ b_i, i = 1, …, m

The terms ‖P_i^T x‖_2 and Φ⁻¹(η) ‖Σ_i^{1/2} x‖_2 are interpreted as the margin budgeted for robustness.


4.10 Semidefinite Programming (SDP)

Definition (SDP). A semidefinite program is of the form:

min_x c^T x
s.t. x_1 F_1 + x_2 F_2 + ⋯ + x_n F_n + F_0 ⪯ 0
     Ax = b

where F_i ∈ S^k (symmetric k×k matrices). The inequality constraint x_1 F_1 + ⋯ + x_n F_n + F_0 ⪯ 0 is called a linear matrix inequality (LMI).

LP → SDP

An LP can be expressed as an SDP by using a diagonal matrix:

min_x c^T x s.t. Ax ⪯ b   ⇔   min_x c^T x s.t. diag(Ax − b) ⪯ 0

SOCP → SDP

The SOC constraint x2t is equivalent to the LMI

‖x‖_2 ≤ t   ⇔   [ tI   x ]
                [ x^T  t ] ⪰ 0

This follows from the Schur Complement Theorem: for symmetric A, C with C ≻ 0,

[ A    B ]
[ B^T  C ] ⪰ 0   ⇔   A − B C⁻¹ B^T ⪰ 0

Thus an SOCP can be rewritten as an SDP:

min_x c^T x s.t. ‖D_i x + d_i‖_2 ≤ e_i^T x + f_i
⇔ min_x c^T x s.t. [ (e_i^T x + f_i) I   D_i x + d_i  ]
                   [ (D_i x + d_i)^T    e_i^T x + f_i ] ⪰ 0

Example: Eigenvalue Minimization

min_x λ_max(A(x))

where A(x) = A_0 + x_1 A_1 + ⋯ + x_n A_n (with given A_i ∈ S^k). Equivalent SDP:

min_{x,t} t
s.t. A(x) ⪯ tI

since λ_max(A) ≤ t ⇔ A ⪯ tI.
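The final equivalence can be checked numerically: tI − A is positive semidefinite iff its smallest eigenvalue is nonnegative (A below is made up):

```python
import numpy as np

A0 = np.array([[2.0, 1.0], [1.0, 0.0]])
lam_max = np.linalg.eigvalsh(A0)[-1]   # = 1 + sqrt(2)

def is_psd(M, tol=1e-9):
    return np.linalg.eigvalsh(M)[0] >= -tol

above = is_psd((lam_max + 0.1) * np.eye(2) - A0)  # t >  lambda_max
at    = is_psd(lam_max * np.eye(2) - A0)          # t == lambda_max
below = is_psd((lam_max - 0.1) * np.eye(2) - A0)  # t <  lambda_max
print(above, at, below)  # True True False
```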


4.11 Hierarchy of Problem Classes

The canonical convex optimization problem classes form a nested hierarchy:

LP ⊂ QP ⊂ QCQP ⊂ SOCP ⊂ SDP ⊂ Conic Program

Each class is a special case of the next, with increasing modeling expressiveness and computational cost.

Class | Objective                              | Constraints                                   | Key Structure
LP    | Linear c^T x                           | Linear inequalities Ax ⪯ b                    | Polyhedral feasible set
QP    | Quadratic (1/2) x^T Q x + c^T x, Q ⪰ 0 | Linear inequalities                           | LP with quadratic objective
QCQP  | Quadratic, P_0 ⪰ 0                     | Quadratic inequalities, P_i ⪰ 0               | Quadratic in both objective and constraints
SOCP  | Linear                                 | ‖D_i x + d_i‖_2 ≤ e_i^T x + f_i               | Second-order cone constraints
SDP   | Linear                                 | Linear matrix inequality Σ x_i F_i + F_0 ⪯ 0  | Semidefinite constraints

Conic Program

At the top of the hierarchy is the conic program:

Definition (Conic Program).

min_x c^T x
s.t. A(x) ∈ K
     Ax = b

where A : R^n → X is an affine map and K is a convex cone.

All the previous classes are special cases of conic programs, depending on the choice of cone K:

Problem | Cone K
LP      | K = R^n_+ = {x ∈ R^n : x_1, …, x_n ≥ 0} (nonnegative orthant)
SOCP    | K = Q = {(x, t) ∈ R^{n+1} : ‖x‖_2 ≤ t} (second-order cone)
SDP     | K = S^n_+ = {X ∈ S^n : X ⪰ 0} (positive semidefinite cone)

4.12 Comparison: LP vs. QP in Practice

Aspect      | LP              | QP
Objective   | min c^T x       | min (1/2) x^T Q x + c^T x
Constraints | Dx ⪯ d, Ax = b  | Dx ⪯ d, Ax = b

  • LP ⊂ QP (QP recovers LP when Q = 0).
  • Often, solving an LP in theory amounts to solving a QP in practice.

Diet problem:

  • In theory: the cost c is deterministic → LP.
  • In practice: c is a random vector → QP.

Basis pursuit:

  • In theory: samples are noise-free (Ax = b) → LP.
  • In practice: samples are noisy (Ax ≈ b) → QP (LASSO).

Deterministic vs. stochastic robustness:

  • In theory: the problem data are deterministic → LP.
  • In practice: robust LP under uncertainty → SOCP.

4.13 Practical Methods for Establishing Convexity

Establishing Convexity of a Set C

  1. Apply the definition: for all x_1, x_2 ∈ C and 0 ≤ θ ≤ 1, θx_1 + (1−θ)x_2 ∈ C.
  2. Show C is obtained from simple convex sets by operations that preserve convexity:
    • Intersection
    • Affine function
    • Perspective function
    • Linear-fractional function
  3. Show C is a sublevel set of a convex function.
  4. Show C is the epigraph of a convex function.

Establishing Convexity of a Function f

  1. Apply the definition: f(θx + (1−θ)y) ≤ θf(x) + (1−θ)f(y).
  2. Use equivalent conditions (e.g., ∇²f(x) ⪰ 0).
  3. Show f is obtained from simple convex functions by operations that preserve convexity:
    • Nonnegative weighted sum: g(x) = Σ_{i=1}^m α_i f_i(x), α_i ≥ 0.
    • Composition with affine function: g(x)=f(Ax+b).
    • Pointwise maximum: g(x)=max{f1(x),,fm(x)}.
    • Pointwise supremum.
    • Composition with scalar functions.
    • Minimization.
    • Perspective.

4.14 Summary of Key Theorems

  1. The feasible set of a convex optimization problem is convex.
  2. Every local optimum of a convex problem is a global optimum.
  3. For strictly convex f0, the optimal solution (if it exists) is unique.
  4. First-order optimality condition: x is optimal for min_{x∈C} f(x) iff ⟨∇f(x), y − x⟩ ≥ 0 for all y ∈ C.
  5. Problems can be made equivalent through monotone transformations, change of variables, slack variables, epigraph formulation, and relaxation.
  6. The hierarchy LP ⊂ QP ⊂ QCQP ⊂ SOCP ⊂ SDP ⊂ Conic Program provides progressively richer modeling frameworks, each with convex structure guaranteeing global optimality.