Center of Gravity Method
Todo Items
- Add plots and diagrams illustrating proof ideas (as toy examples).
In this article, I will give a brief guide on what the center of gravity method (COG) is. Again, a good reference would be the note by Sébastien Bubeck1.
We will use the same setting and borrow parts of our exposition from the ellipsoid method2, where the convergence analysis follows along the same lines as in that note's Section 1.3.
1. Setting the Stage
1.1. First Comparison with the Ellipsoid Algorithm
A high-level description of the ellipsoid algorithm: in each iteration we gain total control of a part of the feasible region - the "onion peels" - by slicing along suitable directions (either the direction of a separating hyperplane, or a direction given by a subgradient). By volume-shrinking arguments, as the uncontrolled feasible region shrinks, the optimal value over the peels cannot be too far from the optimal value over the uncontrolled region.
The "ellipsoidal" part comes in when we want to gain control of the next peel, which uses ellipsoids as proxies. In the center of gravity method, as one can guess - we use centers of gravity.
1.2. The Algorithm
Let us use again $\mathcal{P}_t$ and $\mathcal{X}_t$ to denote the "peel" and the "remaining onion" at time $t$.
Initialize:
$t = 1$, $\mathcal{X}_1 = \mathcal{X}$, and $x_1$ to be any point in $\mathcal{X}$.
Halving:
Call the COG oracle (here we think of it as a zeroth order oracle) to obtain the COG:
$$c_t = \frac{1}{\mathrm{vol}(\mathcal{X}_t)} \int_{\mathcal{X}_t} x \, dx.$$
Call the subgradient oracle to obtain $g_t \in \partial f(c_t)$. If $g_t = 0$, the optimality condition is reached; terminate. Otherwise continue.
Update:
$$\mathcal{X}_{t+1} = \mathcal{X}_t \cap \{x : \langle g_t, x - c_t \rangle \le 0\}, \qquad \mathcal{P}_t = \mathcal{X}_t \setminus \mathcal{X}_{t+1};$$
which has the property: the argmin of $f$ over $\mathcal{X}$ remains in $\mathcal{X}_{t+1}$ (indeed, any $x$ with $\langle g_t, x - c_t \rangle > 0$ satisfies $f(x) \ge f(c_t) + \langle g_t, x - c_t \rangle > f(c_t)$).
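To make the loop concrete, here is a minimal toy sketch in Python - not the method as one would deploy it. Since the exact COG oracle is intractable in general, the sketch estimates the COG by Monte Carlo rejection sampling over a bounding box; the names `cog_method`, `in_X`, `subgrad`, and all thresholds are illustrative choices of ours, not part of the method itself.

```python
import numpy as np

def cog_method(f, subgrad, in_X, box, n_iters=30, n_samples=50_000, seed=0):
    """Toy COG method on X = {x in box : in_X(x) holds}.

    The exact COG oracle is intractable in general, so this sketch
    estimates c_t by rejection sampling inside `box`; `halfspaces`
    accumulates the cuts defining the remaining region X_t.
    """
    rng = np.random.default_rng(seed)
    lo, hi = box
    halfspaces = []                    # pairs (g_t, c_t): keep <g_t, x - c_t> <= 0
    best_x, best_val = None, np.inf
    for _ in range(n_iters):
        pts = rng.uniform(lo, hi, size=(n_samples, len(lo)))
        mask = np.array([in_X(p) for p in pts])
        for g, c in halfspaces:
            mask &= (pts - c) @ g <= 0
        if not mask.any():             # estimated volume of X_t vanished
            break
        c_t = pts[mask].mean(axis=0)   # zeroth-order oracle: estimated COG
        if f(c_t) < best_val:
            best_x, best_val = c_t, f(c_t)
        g_t = subgrad(c_t)             # first-order oracle
        if np.linalg.norm(g_t) < 1e-12:
            return c_t                 # 0 in the subdifferential: optimal
        halfspaces.append((g_t, c_t))  # X_{t+1} = X_t ∩ {<g_t, x - c_t> <= 0}
    return best_x

# Example: minimize |x1 - 0.3| + (x2 + 0.2)^2 over the unit disk.
f = lambda x: abs(x[0] - 0.3) + (x[1] + 0.2) ** 2
subgrad = lambda x: np.array([np.sign(x[0] - 0.3), 2 * (x[1] + 0.2)])
print(cog_method(f, subgrad, in_X=lambda x: x @ x <= 1.0,
                 box=(np.array([-1.0, -1.0]), np.array([1.0, 1.0]))))
```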
One has the following:
COG Method: Convergence
Assume $f : \mathcal{X} \to [-B, B]$ is convex. We have the following bound for the best point queried so far:
$$\min_{1 \le s \le t} f(c_s) - \min_{x \in \mathcal{X}} f(x) \;\le\; 2B\left(1 - \frac{1}{e}\right)^{t/n}.$$
Some remarks:
- The step "Compute center of gravity" is a highly nontrivial step in general if one wants exact solutions (computing the COG of a polytope exactly is already #P-hard). In this sense, the algorithm is more of theoretical importance than of practical use.
- The volume inequality during the passage from $\mathcal{X}_t$ to $\mathcal{X}_{t+1}$, namely $\mathrm{vol}(\mathcal{X}_{t+1}) \le \left(1 - \frac{1}{e}\right)\mathrm{vol}(\mathcal{X}_t)$, will be shown below - it is related to the so-called Grünbaum Inequality, which can in turn be proved by another famous inequality, the Brunn-Minkowski Inequality.
- The analytical complexity to guarantee an accuracy of $\varepsilon$ is $O\!\left(n \log \frac{2B}{\varepsilon}\right)$. Compared to the ellipsoid method's $O\!\left(n^2 \log \frac{1}{\varepsilon}\right)$-type guarantee, both having a linear rate of convergence, it is less sensitive to the underlying dimension $n$.
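To spell out the complexity claim: solving $2B\left(1 - \frac{1}{e}\right)^{t/n} \le \varepsilon$ for $t$ gives
$$t \;\ge\; \frac{n \ln\frac{2B}{\varepsilon}}{\ln\frac{1}{1 - 1/e}} \;\approx\; 2.18\, n \ln\frac{2B}{\varepsilon},$$
so $O\!\left(n \log \frac{2B}{\varepsilon}\right)$ iterations indeed suffice, with a modest constant.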
2. Math Behind the COG Method
The heart of the COG method lies in Brunn-Minkowski's inequality and Grünbaum's inequality. The latter ensures that the peel we gain control of in each step of the COG method is large enough, while the former is an inequality relating the volumes of two convex sets and of their convex combinations.
For the casual readers: you already understand what the COG method does 🎉!
For the more serious folks here, let us now dive into the beautiful maths.
2.1. Additional Notations
We introduce the following notations:
- Given
and
, we write:
which is just a dirty notation for constructing cones, convex combinations. Some times we do not distinguish between elements and one-point sets.
- We write
, the volume of the unit ball (in
)
- Given a measurable subset
, we write
, which satisfies:
- Given
with
, we define
as the set of elements
with
. We define
analogously.
- Given
,
, we can consider the slice:
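As a quick example tying these together: for the unit ball $B_2^n$ we have $\mathrm{cog}(B_2^n) = 0$ by symmetry, the slices are $(B_2^n)_t = \sqrt{1 - t^2}\, B_2^{n-1}$ for $t \in [-1, 1]$, and slicing gives $\mathrm{vol}(B_2^n) = \kappa_{n-1} \int_{-1}^{1} (1 - t^2)^{(n-1)/2} \, dt$ - integrals of exactly this shape recur throughout the proofs below.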
2.2. Brunn-Minkowski Inequality
A good reference would be R. J. Gardner's review3. The Brunn-Minkowski inequality in the case of convex bodies can be stated concisely as follows4:
Brunn-Minkowski Inequality
Fix convex bodies $K, L \subseteq \mathbb{R}^n$.
- Additive. The function $\lambda \mapsto \mathrm{vol}\big((1-\lambda)K + \lambda L\big)^{1/n}$ is concave on $[0, 1]$.
- Multiplicative. The function $\lambda \mapsto \mathrm{vol}\big((1-\lambda)K + \lambda L\big)$ is log-concave on $[0, 1]$.
We will show Item 1. Item 2 follows from Item 15, by the fact that $\log \mathrm{vol}\big((1-\lambda)K + \lambda L\big)$ and $\log\Big(\mathrm{vol}\big((1-\lambda)K + \lambda L\big)^{1/n}\Big)$ differ by a constant factor $n$, and that $\log$, being concave and nondecreasing, preserves concavity under composition on the places we are interested in. To show Item 1, it suffices to show the following:
$$\mathrm{vol}(A + B)^{1/n} \;\ge\; \mathrm{vol}(A)^{1/n} + \mathrm{vol}(B)^{1/n}$$
for compact $A, B \subseteq \mathbb{R}^n$: indeed, applying this to $A = (1-\lambda)K$ and $B = \lambda L$ and using the homogeneity $\mathrm{vol}(cA)^{1/n} = c \, \mathrm{vol}(A)^{1/n}$ recovers concavity.
By definition of Lebesgue measure, it suffices to show the case where $A, B$ are finite unions of sets of the form $[a_1, b_1] \times \cdots \times [a_n, b_n]$ (boxes). Write $\#A$ for the least number of boxes needed to write $A$ as a union of boxes, and assume WLOG $\#A \ge \#B$. We prove by induction on $\#A + \#B$:
- Base Case. If $\#A = \#B = 1$, may assume $A = \prod_i [0, a_i]$, $B = \prod_i [0, b_i]$, so that $A + B = \prod_i [0, a_i + b_i]$, giving:
$$\frac{\mathrm{vol}(A)^{1/n} + \mathrm{vol}(B)^{1/n}}{\mathrm{vol}(A + B)^{1/n}} = \prod_i \left(\frac{a_i}{a_i + b_i}\right)^{1/n} + \prod_i \left(\frac{b_i}{a_i + b_i}\right)^{1/n} \le \frac{1}{n} \sum_i \frac{a_i}{a_i + b_i} + \frac{1}{n} \sum_i \frac{b_i}{a_i + b_i} = 1,$$
by applying AM-GM to the two products of $n$ terms.
- Inductive Step. By permuting coordinates and translating, may assume the hyperplane $\{x_1 = 0\}$ separates two boxes of $A$, so that $A_+$, $A_-$ both have positive measure, with $\#A_\pm$ both being smaller than $\#A$; then translate $B$ so that:
$$\frac{\mathrm{vol}(B_\pm)}{\mathrm{vol}(B)} = \frac{\mathrm{vol}(A_\pm)}{\mathrm{vol}(A)} =: \theta_\pm \qquad (\text{note } \#B_\pm \le \#B),$$
which concludes the proof:
$$\mathrm{vol}(A + B) \ge \sum_{\pm} \mathrm{vol}(A_\pm + B_\pm) \ge \sum_{\pm} \left(\mathrm{vol}(A_\pm)^{1/n} + \mathrm{vol}(B_\pm)^{1/n}\right)^n = \sum_{\pm} \theta_\pm \left(\mathrm{vol}(A)^{1/n} + \mathrm{vol}(B)^{1/n}\right)^n = \left(\mathrm{vol}(A)^{1/n} + \mathrm{vol}(B)^{1/n}\right)^n,$$
where the first inequality holds because $A_+ + B_+$ and $A_- + B_-$ lie on opposite sides of a hyperplane, and the second is the induction hypothesis.
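As a sanity check (no substitute for the proof), the base case is easy to test numerically; a small illustrative Python snippet, with all parameters chosen arbitrarily:

```python
import numpy as np

# Illustrative check of vol(A+B)^(1/n) >= vol(A)^(1/n) + vol(B)^(1/n)
# in the base case: A = prod_i [0, a_i], B = prod_i [0, b_i], so that
# A + B = prod_i [0, a_i + b_i].
rng = np.random.default_rng(0)
for _ in range(1_000):
    n = int(rng.integers(1, 8))
    a = rng.uniform(0.1, 3.0, size=n)
    b = rng.uniform(0.1, 3.0, size=n)
    lhs = np.prod(a + b) ** (1 / n)
    rhs = np.prod(a) ** (1 / n) + np.prod(b) ** (1 / n)
    assert lhs >= rhs - 1e-12, (a, b)
```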
Aside
Another common way to prove it is to first use the Prékopa–Leindler inequality (specialized to characteristic functions) to establish the multiplicative version, and then derive the additive version. The interested reader can consult this lecture note.
2.3. Grünbaum Inequality
I highly recommend checking out this lecture note from Lap Chi Lau's course.
Grünbaum Inequality
Given a convex body $K \subseteq \mathbb{R}^n$ with COG at $0$, we have $\mathrm{vol}(K_-) \ge \frac{1}{e}\, \mathrm{vol}(K)$. (Flipping the first axis gives the same bound for $K_+$; equivalently, each side of a hyperplane through the COG captures between $\frac{1}{e}$ and $1 - \frac{1}{e}$ of the total volume - exactly the volume inequality quoted for the COG method.)
The proof is a series of clever reductions to simpler cases. It would be beneficial to give definitions to these cases. Given a nonnegative function $f : \mathbb{R} \to \mathbb{R}_{\ge 0}$ with support a closed interval $[a, b]$ containing $0$, we can define a solid of revolution, given by:
$$K_f := \{(t, y) \in \mathbb{R} \times \mathbb{R}^{n-1} : t \in [a, b],\ \|y\|_2 \le f(t)\}.$$
Note that $K_f$ is convex iff $f$ is concave on its support. We will see that this is the entry point of Brunn-Minkowski's inequality in the proof of this theorem.
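For instance, $f(t) = \sqrt{1 - t^2}$ on $[-1, 1]$ gives back the unit ball, while a linear $f$ vanishing at one endpoint of its support gives a cone - precisely the special shape the reductions below aim for.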
2.3.1. Special Case: Cone
We show the inequality in the case of cones - that is, in the case where $K = K_f$ with support $[\alpha, \beta] \ni 0$ and with $f$ being linear. May assume $f(\alpha) = 0$ (the apex) and, after scaling, $f(t) = t - \alpha$. We get the claim by the following equation utilizing the condition that the COG of $K$ is $0$:
$$0 = \big(\mathrm{cog}(K)\big)_1 = \frac{\int_\alpha^\beta t \, f(t)^{n-1} \, dt}{\int_\alpha^\beta f(t)^{n-1} \, dt} = \frac{\int_\alpha^\beta t (t - \alpha)^{n-1} \, dt}{\int_\alpha^\beta (t - \alpha)^{n-1} \, dt} = \alpha + \frac{n}{n+1}(\beta - \alpha),$$
which gives:
$$\frac{\mathrm{vol}(K_-)}{\mathrm{vol}(K)} = \left(\frac{-\alpha}{\beta - \alpha}\right)^n = \left(\frac{n}{n+1}\right)^n = \left(1 + \frac{1}{n}\right)^{-n} \ge \frac{1}{e},$$
where the first equality holds because $K_-$ is the sub-cone of height $-\alpha$ out of the total height $\beta - \alpha$. Note that the same computation shows $\mathrm{vol}(K_-) \ge \left(\frac{n}{n+1}\right)^n \mathrm{vol}(K)$ as soon as the COG has non-positive first coordinate; this is the form we will use below.
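A quick numerical illustration of this computation (ours, purely for intuition), using SciPy for the one-dimensional integrals; the normalization $\alpha = -n$, $\beta = 1$ is an arbitrary choice that puts the COG at $0$:

```python
import numpy as np
from scipy.integrate import quad

# Cone with apex at alpha = -n and base at beta = 1, i.e. f(t) = t + n:
# the COG sits at 0 and vol(K_-)/vol(K) = (n/(n+1))^n >= 1/e.
for n in (2, 3, 5, 10):
    w = lambda t, n=n: (t + n) ** (n - 1)   # slice volume, up to kappa_{n-1}
    total = quad(w, -n, 1)[0]
    first_moment = quad(lambda t, n=n: t * w(t, n), -n, 1)[0]
    frac_minus = quad(w, -n, 0)[0] / total
    print(n, first_moment / total, frac_minus, (n / (n + 1)) ** n, 1 / np.e)
```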
2.3.2. Special Case: Convex Solids of Revolution
Now we show the inequality in the case of solids of revolution. Suppose the convex body $K$ has the form $K_f$; then $f$ is concave, say with support $[a, b] \ni 0$. Associate to $f$ a function $g$, with the following properties:
- $g$, with support $[\alpha, \beta]$ containing $0$,
- is non-negative, linear on $[\alpha, \beta]$, with $g(\alpha) = 0$ (so $K_g$ is a cone) and $g(0) = f(0)$,
- $\mathrm{vol}\big((K_g)_-\big) = \mathrm{vol}\big((K_f)_-\big)$ and $\mathrm{vol}\big((K_g)_+\big) = \mathrm{vol}\big((K_f)_+\big)$ (these two conditions pin down $\alpha$ and $\beta$),
then by concavity of $f$ and these conditions, one can show that:
$g$ dominates $f$ on $[0, \beta]$, while $f$ dominates $g$ on $[\beta, b]$.
$f$ dominates $g$ on the part of the negative side near $0$, while $g$ dominates $f$ near the apex.
On each side of $0$, the passage from $f$ to $g$ thus moves mass towards the left while preserving that side's volume, so the first moment can only decrease. Therefore, the COG of $K_g$ is non-positive, so the special case for cones gives:
$$\mathrm{vol}\big((K_f)_-\big) = \mathrm{vol}\big((K_g)_-\big) \ge \left(\frac{n}{n+1}\right)^n \mathrm{vol}(K_g) = \left(\frac{n}{n+1}\right)^n \mathrm{vol}(K_f) \ge \frac{1}{e}\, \mathrm{vol}(K_f).$$
2.3.3. General Case: Reduction to Solids of Revolution
For a general convex body $K$ (WLOG with COG at $0$), consider the function $f : \mathbb{R} \to \mathbb{R}_{\ge 0}$, given by:
$$f(t) := \left(\frac{\mathrm{vol}_{n-1}(K_t)}{\kappa_{n-1}}\right)^{1/(n-1)}.$$
By Brunn-Minkowski (applied in $\mathbb{R}^{n-1}$ to the slices, using $K_{(1-\lambda)s + \lambda t} \supseteq (1-\lambda)K_s + \lambda K_t$), it is a compactly-supported concave non-negative function, so by the identity $\mathrm{vol}_{n-1}\big((K_f)_t\big) = \kappa_{n-1} f(t)^{n-1} = \mathrm{vol}_{n-1}(K_t)$, we are done by considering the solid of revolution $K_f$6: it has the same slice volumes as $K$, hence the same total volume, the same first coordinate of the COG, and the same volumes of $K_\pm$, so the bound for $K_f$ transfers verbatim to $K$.
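As a final illustrative check (ours, not part of the proof), one can test the two-sided bound $\frac{1}{e} \le \frac{\mathrm{vol}(K_-)}{\mathrm{vol}(K)} \le 1 - \frac{1}{e}$ by Monte Carlo on a random convex body; all parameters below are arbitrary:

```python
import numpy as np

# Monte Carlo sanity check of Grünbaum's inequality on a random convex
# body: K = {x : Ax <= 1} intersected with the sampling box (still convex).
rng = np.random.default_rng(1)
n = 3
A = rng.normal(size=(12, n))
pts = rng.uniform(-2.0, 2.0, size=(500_000, n))
K = pts[np.all(pts @ A.T <= 1.0, axis=1)]
K = K - K.mean(axis=0)                     # translate the (estimated) COG to 0
for direction in np.eye(n):                # check a few cut directions
    frac = np.mean(K @ direction <= 0.0)   # vol(K_-)/vol(K), estimated
    assert 1 / np.e - 0.02 <= frac <= 1 - 1 / np.e + 0.02
```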
Bubeck, Sébastien. “Convex optimization: Algorithms and complexity.” Foundations and Trends® in Machine Learning 8.3-4 (2015): 231-357. ↩︎
Ellipsoid Method, on this site. ↩︎
Gardner, Richard. "The Brunn-Minkowski Inequality." Bulletin of the American Mathematical Society 39.3 (2002): 355-405. ↩︎
For generalists, we remind the reader that this theorem holds in much more general settings. Our assumptions guarantee that $(1-\lambda)K + \lambda L$ (which is also a convex body) is measurable with positive finite volume, which avoids unfun cases. ↩︎
One can also show that the multiplicative version implies the additive version. We don't need it here though. ↩︎
This is often called the "symmetrization" of $K$ in the literature. ↩︎