Chapter 2: Convex Sets

2.1 Mathematical Background

2.1.1 Inner Product of Vectors

Definition (Inner Product). For vectors $x, y \in R^{n}$ , the inner product (dot product) is
$⟨ x, y ⟩ = x^{⊤} y = \sum_{i = 1}^{n} x_{i} y_{i}$

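Geometric Interpretation: The inner product measures the alignment between two vectors. With $θ$ the angle between them (Section 2.1.2), $⟨ x, y ⟩ = ∥ x ∥ ∥ y ∥ \cos θ$ , so:

$θ = 0^{\circ} \Rightarrow$ vectors point in the same direction and $⟨ x, y ⟩ = ∥ x ∥ ∥ y ∥$ .
$θ = 90^{\circ} \Rightarrow$ vectors are orthogonal (perpendicular) and $⟨ x, y ⟩ = 0$ .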

Example:

$x = [\begin{matrix} 1 \\ 2 \end{matrix}], \quad y = [\begin{matrix} 3 \\ 1 \end{matrix}] \Rightarrow ⟨ x, y ⟩ = 1 \cdot 3 + 2 \cdot 1 = 5$

2.1.2 Angle Between Vectors

The angle $θ \in [0, π]$ between nonzero vectors $x$ and $y$ is defined by:

$\cos θ = \frac{⟨ x, y ⟩}{∥ x ∥ ∥ y ∥}, \quad ∥ x ∥ = \sqrt{⟨ x, x ⟩}$

Example:

$x = [\begin{matrix} 1 \\ 2 \end{matrix}], \quad y = [\begin{matrix} 3 \\ 1 \end{matrix}] \Rightarrow θ = \cos^{- 1} (\frac{5}{\sqrt{5} \sqrt{10}}) = \cos^{- 1} (\frac{1}{\sqrt{2}}) = 45^{\circ}$
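
The two examples above can be checked numerically. The following is a minimal sketch in plain Python; the helper names `inner`, `norm`, and `angle_deg` are illustrative choices, not part of the notes.

```python
import math

def inner(x, y):
    """Inner product <x, y> = sum_i x_i * y_i."""
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    """Euclidean norm ||x|| = sqrt(<x, x>)."""
    return math.sqrt(inner(x, x))

def angle_deg(x, y):
    """Angle between two nonzero vectors, in degrees."""
    cos_theta = inner(x, y) / (norm(x) * norm(y))
    # Clamp to [-1, 1] so rounding cannot push the value outside acos's domain.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))

x, y = [1, 2], [3, 1]
print(inner(x, y))      # 5
print(angle_deg(x, y))  # approximately 45 degrees
```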

2.1.3 Inner Product of Matrices

Definition (Frobenius Inner Product). For matrices $A, B \in R^{m \times n}$ , the Frobenius inner product is
$⟨ A, B ⟩_{F} = \sum_{i = 1}^{m} \sum_{j = 1}^{n} A_{i j} B_{i j} = \operatorname{trace} (A^{⊤} B)$

Frobenius Norm:

$∥ A ∥_{F} = \sqrt{⟨ A, A ⟩_{F}} = \sqrt{\sum_{i, j} A_{i j}^{2}}$

Angle Between Matrices:

$\cos θ = \frac{⟨ A, B ⟩_{F}}{∥ A ∥_{F} ∥ B ∥_{F}}$

Example:

$A = [\begin{matrix} 1 & 2 \\ 0 & 3 \end{matrix}], \quad B = [\begin{matrix} 2 & 1 \\ 1 & 0 \end{matrix}] \Rightarrow ⟨ A, B ⟩_{F} = 1 \cdot 2 + 2 \cdot 1 + 0 \cdot 1 + 3 \cdot 0 = 4$
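
As a quick check that the entrywise sum and the trace form agree, here is a small sketch in plain Python with matrices stored as lists of rows. The helper names are illustrative.

```python
import math

def transpose(M):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def matmul(M, N):
    """Matrix product M N for compatible shapes."""
    return [[sum(M[i][k] * N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))]
            for i in range(len(M))]

def trace(M):
    """Sum of the diagonal entries of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))

def frobenius_inner(A, B):
    """Sum over all entries of A_ij * B_ij."""
    return sum(a * b for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))

A = [[1, 2], [0, 3]]
B = [[2, 1], [1, 0]]

print(frobenius_inner(A, B))             # 4, entrywise form
print(trace(matmul(transpose(A), B)))    # 4, trace(A^T B) form
print(math.sqrt(frobenius_inner(A, A)))  # Frobenius norm of A, sqrt(14)
```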

2.1.4 Common Vector Norms

Euclidean Norm ( $ℓ_{2}$ norm):
$∥ x ∥_{2} = \sqrt{\sum_{i = 1}^{n} x_{i}^{2}}$
1-Norm ( $ℓ_{1}$ norm):
$∥ x ∥_{1} = \sum_{i = 1}^{n} | x_{i} |$
Chebyshev Norm ( $ℓ_{\infty}$ norm):
$∥ x ∥_{\infty} = \max_{i} | x_{i} |$
$p$ -Norm (general $ℓ_{p}$ norm, $p \geq 1$ ):
$∥ x ∥_{p} = {(\sum_{i = 1}^{n} | x_{i} |^{p})}^{1 / p}$
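
A minimal sketch of these norms in plain Python, assuming a hypothetical helper `p_norm` that treats `float("inf")` as the Chebyshev case:

```python
def p_norm(x, p):
    """l_p norm for p >= 1; p = float('inf') gives the Chebyshev norm."""
    if p == float("inf"):
        return max(abs(xi) for xi in x)
    if p < 1:
        raise ValueError("p must satisfy p >= 1")
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

x = [3, -4]
print(p_norm(x, 1))             # 7.0
print(p_norm(x, 2))             # 5.0
print(p_norm(x, float("inf")))  # 4
```

For this vector the values are ordered as $∥ x ∥_{\infty} \leq ∥ x ∥_{2} \leq ∥ x ∥_{1}$ , which holds for every $x \in R^{n}$ .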

2.1.5 Symmetric Matrices

Definition. A square matrix $A \in R^{n \times n}$ is symmetric if
$A = A^{⊤}, \quad \text{or equivalently} \quad A_{i j} = A_{j i} \ \text{for all } i, j$

Key Properties:

All eigenvalues are real.
Eigenvectors corresponding to distinct eigenvalues are orthogonal.
Symmetric matrices are diagonalizable by an orthogonal matrix.

2.1.6 Eigenvalue Decomposition

Theorem (Eigenvalue Decomposition). If $A$ is symmetric, it admits
$A = Q Λ Q^{⊤} = \sum_{i = 1}^{n} λ_{i} q_{i} q_{i}^{⊤}$
where:
$Q = [q_{1} \dots q_{n}]$ is orthogonal (its columns are orthonormal eigenvectors of $A$ ): $Q^{⊤} Q = I$
$Λ = \operatorname{diag} (λ_{1}, \dots, λ_{n})$ with $λ_{i} \in R$ the corresponding eigenvalues
Each term $λ_{i} q_{i} q_{i}^{⊤}$ is a rank-1 symmetric matrix.
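
To make the rank-1 expansion concrete, the sketch below rebuilds a matrix from its eigenpairs. The $2 \times 2$ matrix and its eigenpairs are a hypothetical example chosen for illustration; they do not come from the notes.

```python
import math

# Hypothetical example: A = [[2, 1], [1, 2]] has eigenpairs
#   lambda_1 = 3 with q_1 = (1,  1) / sqrt(2)
#   lambda_2 = 1 with q_2 = (1, -1) / sqrt(2)
A = [[2.0, 1.0], [1.0, 2.0]]
s = 1.0 / math.sqrt(2.0)
eigenpairs = [(3.0, [s, s]), (1.0, [s, -s])]

def outer(u, v):
    """Outer product u v^T as a list of rows."""
    return [[ui * vj for vj in v] for ui in u]

def reconstruct(pairs):
    """Sum of the rank-1 terms lambda_i * q_i q_i^T."""
    n = len(pairs[0][1])
    total = [[0.0] * n for _ in range(n)]
    for lam, q in pairs:
        term = outer(q, q)
        for i in range(n):
            for j in range(n):
                total[i][j] += lam * term[i][j]
    return total

A_rebuilt = reconstruct(eigenpairs)
print(A_rebuilt)  # equals A up to floating-point rounding
```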

2.1.7 Positive Semidefinite (PSD) Matrices

Definition. A symmetric matrix $A \in R^{n \times n}$ is positive semidefinite (PSD) if
$x^{⊤} A x \geq 0 \forall x \in R^{n}$
(denoted as $A ⪰ 0$ )

Key Properties:

All eigenvalues of $A$ are nonnegative: $λ_{i} \geq 0$ .
PSD matrices are symmetric, hence diagonalizable by orthogonal matrices.
If $A ⪰ 0$ , then $A = B B^{⊤}$ for some $B \in R^{n \times n}$ (Cholesky factorization or matrix square root).
The set of PSD matrices forms a convex cone.
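
The factorization property also explains why such matrices are PSD: if $A = B B^{⊤}$ , then $x^{⊤} A x = ∥ B^{⊤} x ∥_{2}^{2} \geq 0$ . The sketch below checks this identity for a hypothetical factor `B`; the helper names are illustrative.

```python
def gram(B):
    """Return A = B B^T; entry (i, j) is the inner product of rows i and j of B."""
    return [[sum(bi * bj for bi, bj in zip(row_i, row_j)) for row_j in B]
            for row_i in B]

def quad_form(A, x):
    """Quadratic form x^T A x."""
    Ax = [sum(a * xj for a, xj in zip(row, x)) for row in A]
    return sum(xi * axi for xi, axi in zip(x, Ax))

def bt_x_sq_norm(B, x):
    """||B^T x||^2, where (B^T x)_k = sum_i B[i][k] * x[i]."""
    return sum(sum(bik * xi for bik, xi in zip(col, x)) ** 2 for col in zip(*B))

B = [[1, 2], [0, -1]]  # hypothetical factor, chosen for illustration
A = gram(B)            # [[5, -2], [-2, 1]]

for x in ([1, 2], [1, 3], [-2, 5]):
    # Both columns agree and are nonnegative.
    print(x, quad_form(A, x), bt_x_sq_norm(B, x))
```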

2.1.8 Gradient and Hessian of Multivariate Functions

Gradient: For $f : R^{n} \to R$ , the gradient (direction of steepest increase) is

$\nabla f (x) = [\begin{matrix} \frac{\partial f}{\partial x_{1}} \\ ⋮ \\ \frac{\partial f}{\partial x_{n}} \end{matrix}] \in R^{n}$

Hessian: The Hessian (local curvature) is the matrix of second-order partial derivatives:

$\nabla^{2} f (x) = [\begin{matrix} \frac{\partial^{2} f}{\partial x_{1}^{2}} & \dots & \frac{\partial^{2} f}{\partial x_{1} \partial x_{n}} \\ ⋮ & ⋱ & ⋮ \\ \frac{\partial^{2} f}{\partial x_{n} \partial x_{1}} & \dots & \frac{\partial^{2} f}{\partial x_{n}^{2}} \end{matrix}] \in R^{n \times n}$

Quadratic Example: Let $f (x) = x^{⊤} A x + b^{⊤} x + c$ , with $A \in R^{n \times n}$ symmetric, $b \in R^{n}$ , $c \in R$ . Then

$\nabla f (x) = 2 A x + b, \quad \nabla^{2} f (x) = 2 A$
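
The gradient formula can be sanity-checked against central finite differences. This is a sketch only: the matrix, vector, evaluation point, and step size `h` below are hypothetical values chosen for illustration.

```python
def quadratic(x, A, b, c):
    """f(x) = x^T A x + b^T x + c."""
    Ax = [sum(aij * xj for aij, xj in zip(row, x)) for row in A]
    return (sum(xi * axi for xi, axi in zip(x, Ax))
            + sum(bi * xi for bi, xi in zip(b, x)) + c)

def grad_closed_form(x, A, b):
    """Gradient 2 A x + b, valid when A is symmetric."""
    Ax = [sum(aij * xj for aij, xj in zip(row, x)) for row in A]
    return [2 * axi + bi for axi, bi in zip(Ax, b)]

def grad_central_diff(f, x, h=1e-5):
    """Approximate each partial derivative by (f(x + h e_i) - f(x - h e_i)) / (2h)."""
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((f(xp) - f(xm)) / (2 * h))
    return grad

A = [[2.0, 1.0], [1.0, 3.0]]  # symmetric
b = [1.0, -1.0]
c = 0.5
x0 = [1.0, 2.0]

exact = grad_closed_form(x0, A, b)  # [9.0, 13.0]
approx = grad_central_diff(lambda x: quadratic(x, A, b, c), x0)
print(exact, approx)
```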

2.2 Lines, Line Segments, and Rays

Let $x, y \in R^{n}$ and define the direction vector $d := y - x$ .

Line through $x$ and $y$ :
$ℓ_{x, y} = {x + θ d ∣ θ \in R}$

Line segment from $x$ to $y$ :
$[x, y] = {x + θ d ∣ θ \in [0, 1]} = {(1 - θ) x + θ y ∣ θ \in [0, 1]}$

Ray starting at $x$ going through $y$ :
$\vec{x y} = {x + θ d ∣ θ \geq 0}$

2.3 Definition of Convex Sets

Definition (Convex Set). A set $C \subseteq R^{n}$ is convex if for any $x, y \in C$ and any $θ \in [0, 1]$ ,
$z = (1 - θ) x + θ y \in C$
$z$ is called a convex combination of $x$ and $y$ .

Intuition: Every line segment between two points in $C$ lies entirely inside $C$ .
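
The definition can be probed numerically by sampling $θ$ along a segment. The sketch below does this for the unit Euclidean ball (convex) and the unit circle (not convex). Sampling can only refute convexity, never prove it, and the endpoints and helper names are illustrative assumptions.

```python
import math

def convex_combination(x, y, theta):
    """Return z = (1 - theta) x + theta y."""
    return [(1 - theta) * xi + theta * yi for xi, yi in zip(x, y)]

def in_unit_ball(z, tol=1e-12):
    """Membership in the closed unit ball {z : ||z||_2 <= 1}."""
    return math.sqrt(sum(zi * zi for zi in z)) <= 1.0 + tol

def on_unit_circle(z, tol=1e-12):
    """Membership in the unit circle {z : ||z||_2 = 1}."""
    return abs(math.sqrt(sum(zi * zi for zi in z)) - 1.0) <= tol

x, y = [1.0, 0.0], [0.0, -1.0]
thetas = [k / 10 for k in range(11)]

# Every sampled point of the segment stays inside the ball.
all_inside = all(in_unit_ball(convex_combination(x, y, t)) for t in thetas)

# The midpoint of two points on the circle leaves the circle.
midpoint = convex_combination(x, y, 0.5)
print(all_inside, on_unit_circle(midpoint))  # True False
```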

Examples of convex sets: line, line segment, ray, single point, empty set, subspace, affine space, hyperplane, halfspace, convex cone, positive semidefinite cone, Euclidean balls and ellipsoids, norm balls and norm cones, polyhedra.

Why convex sets are important: The feasible region of a convex optimization problem is a convex set. Combined with a convex objective function, this guarantees that any local minimum is also a global minimum.

2.4 Subspaces and Affine Sets

2.4.1 Subspace

Definition (Subspace). A subset $W \subseteq R^{n}$ is a subspace if:
$0 \in W$ (contains the zero vector)
$x, y \in W \Rightarrow x + y \in W$ (closed under addition)
$x \in W, α \in R \Rightarrow α x \in W$ (closed under scaling)

Examples:

$\operatorname{span} {v_{1}, \dots, v_{k}} = {\sum_{i = 1}^{k} α_{i} v_{i} ∣ α_{i} \in R}$ for $v_{1}, \dots, v_{k} \in R^{n}$
$\operatorname{null} (M) = {x \in R^{n} ∣ M x = 0}$ for $M \in R^{m \times n}$
Any line through the origin in $R^{n}$

2.4.2 Affine Space

Definition (Affine Space). A subset $A \subseteq R^{n}$ is an affine space if for any $x, y \in A$ and any $θ \in R$ ,
$(1 - θ) x + θ y \in A$

Intuition: Affine spaces are like subspaces, but they need not pass through the origin. They are "shifted subspaces."

Examples:

Any line in $R^{2}$ (not necessarily through the origin)
Any plane in $R^{3}$ (not necessarily through the origin)
More generally: $A = x_{0} + W$ , where $x_{0} \in R^{n}$ and $W$ is a subspace

2.4.3 Affine Space vs. Hyperplane

A hyperplane in $R^{n}$ has dimension $n - 1$ .
Affine spaces can have any dimension $0, 1, \dots, n$ .

Examples:

A line in $R^{3}$ : $A = {(θ, 1, 0) ∣ θ \in R}$ , $\dim (A) = 1$ . Not a hyperplane (hyperplanes in $R^{3}$ have dimension 2).
A plane in $R^{4}$ : $A = {x_{0} + θ_{1} v_{1} + θ_{2} v_{2} ∣ θ_{1}, θ_{2} \in R}$ , $\dim (A) = 2$ . Not a hyperplane (hyperplanes in $R^{4}$ have dimension 3).

2.5 Hyperplanes and Halfspaces

Definition (Hyperplane). A hyperplane in $R^{n}$ is a set of the form
$H = {x \in R^{n} ∣ a^{⊤} x = b}$
where $a \in R^{n} ∖ {0}$ and $b \in R$ .

Intuition:

$a$ is a normal vector to the hyperplane.
$b$ shifts the hyperplane; if $b = 0$ , $H$ is a subspace.
Hyperplanes have dimension $n - 1$ .

Examples:

In $R^{2}$ : a line (e.g., $x_{1} + 2 x_{2} = 3$ )
In $R^{3}$ : a plane (e.g., $2 x_{1} - x_{2} + x_{3} = 5$ )

Definition (Halfspace). A closed halfspace is defined by
$H_{a, b}^{-} = {x \in R^{n} ∣ a^{⊤} x \leq b}, a \neq 0$
It contains all points on one side of the hyperplane $H$ .

A hyperplane divides $R^{n}$ into two halfspaces.
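
A small sketch of how a point is classified relative to a hyperplane, using the line $x_{1} + 2 x_{2} = 3$ from the examples above. The function name and the tolerance `tol` are illustrative assumptions.

```python
def side_of_hyperplane(a, b, x, tol=1e-12):
    """Classify x relative to {x : a^T x = b}.

    Returns 'on' if a^T x = b, 'below' if a^T x < b, 'above' if a^T x > b.
    Points labelled 'on' or 'below' lie in the closed halfspace a^T x <= b.
    """
    value = sum(ai * xi for ai, xi in zip(a, x))
    if abs(value - b) <= tol:
        return "on"
    return "below" if value < b else "above"

a, b = [1, 2], 3
print(side_of_hyperplane(a, b, [1, 1]))  # on
print(side_of_hyperplane(a, b, [0, 0]))  # below
print(side_of_hyperplane(a, b, [3, 3]))  # above
```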

2.6 Convex Cones and Polar Cones

Definition (Cone). A set $K \subseteq R^{n}$ is a cone if
$x \in K, α \geq 0 \Rightarrow α x \in K$

Definition (Convex Cone). A set $K \subseteq R^{n}$ is a convex cone if it is a cone and is convex, equivalently:
$x, y \in K, α, β \geq 0 \Rightarrow α x + β y \in K$
$α x + β y$ is called a conic combination of $x, y$ with $α, β \geq 0$ .

Example: $K = {(x_{1}, x_{2}) \in R^{2} ∣ x_{1} \geq 0, x_{2} \geq 0}$ (the first quadrant).

Definition (Polar Cone). The polar cone of $K$ is
$K^{\circ} = {y \in R^{n} ∣ y^{⊤} x \leq 0 \forall x \in K}$
It contains all vectors forming a nonpositive inner product with all elements of $K$ .

Example: The polar of the first quadrant ( $R_{+}^{2}$ ) is the third quadrant ( $R_{-}^{2}$ ).

Properties of Polar Cones:
$K^{\circ}$ is always a closed, convex cone, even if $K$ is not convex.
If $K$ is a closed convex cone, $(K^{\circ})^{\circ} = K$ (bipolar property).
$K_{1} \subseteq K_{2} \Rightarrow K_{2}^{\circ} \subseteq K_{1}^{\circ}$ .

Examples:

If $K = R_{+}^{n}$ , then $K^{\circ} = R_{-}^{n}$ .
If $K$ is a subspace $L$ , then $K^{\circ} = L^{⊥}$ (the orthogonal complement).
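
For a cone generated by finitely many vectors, membership in the polar cone only needs to be checked on the generators: every $x \in K$ is a conic combination $\sum_{i} α_{i} g_{i}$ , so $y^{⊤} g_{i} \leq 0$ for all $i$ implies $y^{⊤} x \leq 0$ . A minimal sketch, with the first quadrant generated by $e_{1}, e_{2}$ and an illustrative function name:

```python
def in_polar_cone(y, generators, tol=1e-12):
    """Check y^T g <= 0 for every generator g of a finitely generated cone."""
    return all(sum(yi * gi for yi, gi in zip(y, g)) <= tol for g in generators)

first_quadrant = [[1, 0], [0, 1]]  # generators e_1, e_2

print(in_polar_cone([-1, -2], first_quadrant))  # True: third quadrant
print(in_polar_cone([-1, 1], first_quadrant))   # False: y^T e_2 > 0
```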

Types of Combinations

Type	Form	Coefficient constraints	Resulting set
Linear combination	$\sum_{i = 1}^{k} θ_{i} x_{i}$	$θ_{i} \in R$	Subspace ( $span {x_{i}}$ )
Conic combination	$\sum_{i = 1}^{k} θ_{i} x_{i}$	$θ_{i} \geq 0$	Convex cone
Affine combination	$\sum_{i = 1}^{k} θ_{i} x_{i}$	$\sum_{i = 1}^{k} θ_{i} = 1$	Affine space
Convex combination	$\sum_{i = 1}^{k} θ_{i} x_{i}$	$θ_{i} \geq 0, \sum_{i = 1}^{k} θ_{i} = 1$	Convex set
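
The coefficient constraints in the table translate directly into checks on the vector $θ$ . The sketch below uses a hypothetical helper `combination_types` that reports every type a given coefficient vector qualifies as.

```python
def combination_types(theta, tol=1e-12):
    """List which combination types the coefficients theta satisfy."""
    nonneg = all(t >= -tol for t in theta)
    sums_to_one = abs(sum(theta) - 1.0) <= tol
    types = ["linear"]  # no constraint on the coefficients
    if nonneg:
        types.append("conic")
    if sums_to_one:
        types.append("affine")
    if nonneg and sums_to_one:
        types.append("convex")
    return types

print(combination_types([0.25, 0.75]))  # ['linear', 'conic', 'affine', 'convex']
print(combination_types([2, -1]))       # ['linear', 'affine']
print(combination_types([2, 3]))        # ['linear', 'conic']
```

Every convex combination is also conic, affine, and linear, which mirrors the inclusions between the constraint sets.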

2.7 Tangent Cone and Normal Cone

Let $C \subseteq R^{n}$ be a set and $x \in C$ .

Definition (Tangent Cone). The tangent cone of $C$ at $x$ is
$T_{C} (x) = {d \in R^{n} ∣ \exists \, t_{k} ↓ 0, \ d_{k} \to d \ \text{with} \ x + t_{k} d_{k} \in C \ \text{for all } k}$

Contains the directions in which one can move away from $x$ while staying in $C$ , at least infinitesimally.
For convex $C$ , it is the closure of the cone of feasible directions at $x$ .

Definition (Normal Cone). The normal cone of $C$ at $x$ is
$N_{C} (x) = {y \in R^{n} ∣ y^{⊤} d \leq 0 \forall d \in T_{C} (x)}$

Contains the vectors that make an angle of at least $90^{\circ}$ with every direction in the tangent cone at $x$ ; that is, $N_{C} (x)$ is the polar cone of $T_{C} (x)$ .
Often used in optimality conditions and convex analysis.

2.8 Separating Hyperplane Theorem

Theorem (Separating Hyperplane). Let $C, D \subseteq R^{n}$ be two nonempty, disjoint, convex sets. Then there exists a nonzero vector $a \in R^{n}$ and a scalar $b \in R$ such that
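$a^{⊤} x \leq b \leq a^{⊤} y \quad \text{for all } x \in C, \ y \in D$
This means there exists a hyperplane ${x ∣ a^{⊤} x = b}$ that separates $C$ and $D$ (i.e., $C \subseteq H_{a, b}^{-}$ , $D \subseteq H_{a, b}^{+}$ ), where $H_{a, b}^{+} = {x \in R^{n} ∣ a^{⊤} x \geq b}$ is the opposite closed halfspace.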

Simple interpretation: There is a hyperplane that can separate any point outside a convex set from the set itself.

Examples:

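The nonnegative orthant in $R^{n}$ : $R_{+}^{n} = {x \in R^{n} ∣ x_{i} \geq 0, i = 1, \dots, n}$
- Separating hyperplane: $a = e_{1}$ , $b = 0$ . It separates $R_{+}^{n}$ from any point $z$ with $z_{1} < 0$ , since $a^{⊤} z = z_{1} < 0 \leq x_{1} = a^{⊤} x$ for all $x \in R_{+}^{n}$ .
Positive semidefinite matrices $S_{+}^{n}$ :
- Separating hyperplane: $a = q_{1} q_{1}^{⊤}$ , $b = 0$ (using the Frobenius inner product). It separates $S_{+}^{n}$ from any symmetric matrix $Y$ that has a unit eigenvector $q_{1}$ with eigenvalue $λ_{1} < 0$ , since $⟨ q_{1} q_{1}^{⊤}, Y ⟩_{F} = q_{1}^{⊤} Y q_{1} = λ_{1} < 0 \leq q_{1}^{⊤} X q_{1}$ for all $X \in S_{+}^{n}$ .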

Note: Convexity is required; non-convex sets may not be separable.
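
As an illustration of the theorem, the sketch below builds a separating hyperplane for two disjoint disks of equal radius: the normal $a$ points from one center to the other and the hyperplane passes through the midpoint of the centers. The centers, radius, and sampling of boundary points are hypothetical choices, and this construction relies on the equal radii.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

# Hypothetical sets: C is the unit disk at the origin, D the unit disk at (3, 0).
center_C, center_D, radius = [0.0, 0.0], [3.0, 0.0], 1.0

a = [d - c for c, d in zip(center_C, center_D)]             # normal (3, 0)
midpoint = [(c + d) / 2 for c, d in zip(center_C, center_D)]
b = dot(a, midpoint)                                        # offset 4.5

def boundary_points(center, r, count=36):
    """Sample points on the circle of radius r around center."""
    return [[center[0] + r * math.cos(2 * math.pi * k / count),
             center[1] + r * math.sin(2 * math.pi * k / count)]
            for k in range(count)]

# A linear function attains its extremes over a disk on the boundary,
# so checking boundary samples is a reasonable spot check.
separates = (all(dot(a, x) <= b for x in boundary_points(center_C, radius))
             and all(dot(a, y) >= b for y in boundary_points(center_D, radius)))
print(a, b, separates)  # [3.0, 0.0] 4.5 True
```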

2.9 Generalized Inequalities

Let $K \subseteq R^{n}$ be a proper cone (a convex cone that is closed, pointed, and has nonempty interior).

Definition (Generalized Inequality). For $x, y \in R^{n}$ :
$x ⪯_{K} y ⟺ y - x \in K$

Strict version:
$x ≺_{K} y ⟺ y - x \in \operatorname{int} (K)$

Examples:

$K = R_{+}^{n}$ : $x ⪯_{K} y ⟺ x_{i} \leq y_{i}$ for all $i$ (componentwise inequality).
$K = S_{+}^{n}$ (PSD cone): $X ⪯_{K} Y ⟺ Y - X ⪰ 0$ .

Note: Two points can be incomparable, with both $x ⋠_{K} y$ and $y ⋠_{K} x$ holding at the same time. A generalized inequality is therefore in general only a partial order, not a total order.
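
For $K = R_{+}^{n}$ the ordering is easy to test, and incomparable pairs show up immediately. A minimal sketch with an illustrative function name:

```python
def leq_orthant(x, y):
    """x <=_K y for K = R^n_+, i.e. y - x has no negative component."""
    return all(yi - xi >= 0 for xi, yi in zip(x, y))

print(leq_orthant([1, 1], [2, 3]))  # True
print(leq_orthant([1, 2], [2, 1]))  # False
print(leq_orthant([2, 1], [1, 2]))  # False, so (1, 2) and (2, 1) are incomparable
```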

2.10 Minimum and Minimal Elements

Let $X \subseteq R^{n}$ and $K$ be a proper cone.

Definition (Minimum Element). $x^{*} \in X$ is the minimum if
$x^{*} ⪯_{K} x \forall x \in X$
It is the smallest element; unique if it exists. Equivalently, $X \subseteq x^{*} + K$ .

Definition (Minimal Element). $x \in X$ is a minimal element if there is no $y \in X$ such that
$y ⪯_{K} x \ \text{and} \ y \neq x$
There may be multiple minimal elements. Equivalently, $(x - K) \cap X = {x}$ .
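
The difference between the two notions is easiest to see on a finite set. The sketch below uses $K = R_{+}^{2}$ (componentwise ordering) and two hypothetical point sets; the function names are illustrative.

```python
def leq(x, y):
    """x <=_K y for K = R^n_+ (componentwise)."""
    return all(xi <= yi for xi, yi in zip(x, y))

def minimum_element(X):
    """Return the element that is <= every element of X, or None if none exists."""
    for x in X:
        if all(leq(x, y) for y in X):
            return x
    return None

def minimal_elements(X):
    """Return all x in X with no other y in X satisfying y <= x."""
    return [x for x in X if not any(leq(y, x) and y != x for y in X)]

X1 = [(1, 3), (2, 2), (3, 1), (3, 3)]
X2 = [(1, 1), (2, 1), (1, 3)]

print(minimum_element(X1), minimal_elements(X1))  # None [(1, 3), (2, 2), (3, 1)]
print(minimum_element(X2), minimal_elements(X2))  # (1, 1) [(1, 1)]
```

In `X1` three points are minimal but none is the minimum, because they are pairwise incomparable. In `X2` the minimum exists and is the only minimal element.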

2.11 Dual Cone

Definition (Dual Cone). Let $K \subseteq R^{n}$ be a cone. The dual cone of $K$ is defined as
$K^{*} = {y \in R^{n} ∣ y^{⊤} x \geq 0 \forall x \in K}$

Properties:
$K^{*}$ is always a closed, convex cone.
If $K$ is a closed convex cone, $(K^{*})^{*} = K$ (bipolar property).
$K_{1} \subseteq K_{2} \Rightarrow K_{2}^{*} \subseteq K_{1}^{*}$ .
Relation to polar cone: $K^{*} = - K^{\circ}$ .

Examples:

$K = R_{+}^{n} \Rightarrow K^{*} = R_{+}^{n}$ (self-dual).
$K = S_{+}^{n}$ (PSD cone) $\Rightarrow K^{*} = S_{+}^{n}$ (self-dual).

2.12 Positive Semidefinite (PSD) Cone

$S^{n} = {X \in R^{n \times n} ∣ X^{⊤} = X}$ is the set of $n \times n$ symmetric matrices.
$S_{+}^{n} = {X \in S^{n} ∣ X ⪰ 0}$ is the positive semidefinite cone.
- $z^{⊤} X z \geq 0$ for all $z \in R^{n}$ .
- Equivalently, all eigenvalues $\geq 0$ .
$S_{+ +}^{n} = {X \in S^{n} ∣ X ≻ 0}$ is the positive definite cone.
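
For symmetric $2 \times 2$ matrices the eigenvalue test has a closed form, which gives a compact membership check for these cones. This is a sketch for the $2 \times 2$ case only; the function names and tolerance are illustrative assumptions.

```python
import math

def eigenvalues_sym_2x2(X):
    """Eigenvalues (smallest, largest) of a symmetric matrix [[a, b], [b, c]]."""
    a, b, c = X[0][0], X[0][1], X[1][1]
    mean = (a + c) / 2.0
    radius = math.sqrt(((a - c) / 2.0) ** 2 + b ** 2)
    return mean - radius, mean + radius

def classify(X, tol=1e-12):
    """Report membership based on the smallest eigenvalue."""
    smallest, _ = eigenvalues_sym_2x2(X)
    if smallest > tol:
        return "positive definite"
    if smallest >= -tol:
        return "positive semidefinite"
    return "not PSD"

print(classify([[2, 1], [1, 2]]))  # positive definite (eigenvalues 1, 3)
print(classify([[1, 1], [1, 1]]))  # positive semidefinite (eigenvalues 0, 2)
print(classify([[1, 2], [2, 1]]))  # not PSD (eigenvalues -1, 3)
```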

2.13 Operations That Preserve Convexity

Practical methods for establishing convexity of a set $C$ :

Apply the definition of convexity:
$x_{1}, x_{2} \in C, 0 \leq θ \leq 1 \Rightarrow θ x_{1} + (1 - θ) x_{2} \in C$
Show that $C$ is obtained from simple convex sets (hyperplanes, halfspaces, norm balls, etc.) by operations that preserve convexity:
- Intersection
- Affine function
- Perspective function
- Linear-fractional function

2.13.1 Intersection of Convex Sets

Property. The intersection of any (finite or infinite) collection of convex sets is convex:
$C = ⋂_{i \in I} C_{i} \ \text{is convex if each } C_{i} \text{ is convex.}$

Example:

$C_{1} = {x ∣ x_{1} \geq 0}, \quad C_{2} = {x ∣ x_{2} \geq 0} \Rightarrow C_{1} \cap C_{2} = R_{+}^{2}$

Interpretation: Adding constraints (intersections) preserves convexity.

2.13.2 Affine Mapping

Property. If $C \subseteq R^{n}$ is convex, $A \in R^{m \times n}$ , and $b \in R^{m}$ , then the image of $C$ under the affine map $x \mapsto A x + b$ is convex:
${y ∣ y = A x + b, x \in C} \ \text{is convex.}$

Example: Scaling and translating a convex set preserves convexity:

$C = {x ∣ ∥ x ∥_{2} \leq 1} \Rightarrow A C + b = {y ∣ y = A x + b, ∥ x ∥_{2} \leq 1} \ \text{is convex.}$

Interpretation: Linear transformations and translations do not "bend" the set, so they preserve convexity.
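
The reason behind this property is that an affine map commutes with convex combinations: $A ((1 - θ) x + θ y) + b = (1 - θ)(A x + b) + θ (A y + b)$ . The sketch below checks this identity for a hypothetical $A$ , $b$ , and pair of points.

```python
def affine(A, b, x):
    """Apply the affine map x -> A x + b."""
    return [sum(aij * xj for aij, xj in zip(row, x)) + bi for row, bi in zip(A, b)]

def convex_combination(x, y, theta):
    """Return (1 - theta) x + theta y."""
    return [(1 - theta) * xi + theta * yi for xi, yi in zip(x, y)]

A = [[2.0, 0.0], [1.0, 1.0]]  # hypothetical matrix
b = [1.0, -1.0]               # hypothetical offset
x, y, theta = [1.0, 0.0], [0.0, 1.0], 0.25

# Map the combination, then combine the mapped points; both give the same result.
image_of_combination = affine(A, b, convex_combination(x, y, theta))
combination_of_images = convex_combination(affine(A, b, x), affine(A, b, y), theta)
print(image_of_combination, combination_of_images)  # [2.5, 0.0] [2.5, 0.0]
```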
