Center of Gravity Method
Todo Items
Add plots and diagrams illustrating proof ideas (as toy examples).
In this article, I will give a brief guide on what the center of gravity method (COG) is. Again, a good reference would be the note by Sébastien Bubeck1.
We will use the same setting and borrow parts of our exposition from the ellipsoid method2, where the convergence analysis follows along the same lines as in section 1.3 of that note.
1. Setting the Stage
1.1. First Comparison with the Ellipsoid Algorithm
A high-level description of the ellipsoid algorithm is as follows: in each iteration, we gain total control of a part of the feasibility region (an "onion peel") by slicing along a suitable direction (either the direction of a separating hyperplane, or a direction given by a subgradient). By volume-shrinking arguments, as the uncontrolled feasibility region shrinks, the optimal value over the peels cannot be too far from the optimal value over the uncontrolled region.
The "ellipsoidal" part comes in when we want to gain control of the next peel, for which ellipsoids serve as proxies. In the center of gravity method, as one can guess, we use centers of gravity instead.
1.2. The Algorithm
Let us again use $P_t$ and $S_t$ to denote the "peel" and the "remaining onion" at time $t$; throughout, $\mathcal{X} \subseteq \mathbb{R}^n$ is the convex feasibility region and $f$ the convex objective.
Initialize: $t := 0$, $S_0 := \mathcal{X}$, and $x_0$ to be any point in $\mathcal{X}$.
Halving:
Call the COG oracle (here we think of it as a zeroth-order oracle) to obtain the COG:
$$c_t := \mathrm{cog}(S_t) = \frac{1}{\mathrm{vol}(S_t)} \int_{S_t} x \,\mathrm{d}x.$$
Call the subgradient oracle to obtain $g_t \in \partial f(c_t)$. If $g_t = 0$, the optimality condition is reached, so terminate. Otherwise continue.
Update:
- $S_{t+1} := S_t \cap \{x : \langle g_t, x - c_t \rangle \le 0\}$ and $P_t := S_t \setminus S_{t+1}$; by Grünbaum's inequality below, this has the property:
$$\mathrm{vol}(S_{t+1}) \le \left(1 - \frac{1}{e}\right) \mathrm{vol}(S_t).$$
- $x_{t+1} :=$ the argmin of $f$ over the queried centers $\{c_0, c_1, \dots, c_t\}$.
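To make the loop concrete, here is a minimal sketch of the method in dimension $n = 2$, where the COG oracle can be implemented exactly (the shoelace formula) and the halving step is polygon clipping. The objective, the target point, and all identifiers are hypothetical choices for illustration, not part of the method's specification.

```python
import numpy as np

def clip_halfplane(poly, g, c):
    """Clip a convex polygon (CCW vertex list) to {x : <g, x - c> <= 0}."""
    out = []
    n = len(poly)
    for i in range(n):
        p, q = poly[i], poly[(i + 1) % n]
        sp, sq = np.dot(g, p - c), np.dot(g, q - c)
        if sp <= 0:
            out.append(p)
        if sp * sq < 0:  # the edge crosses the cutting line
            t = sp / (sp - sq)
            out.append(p + t * (q - p))
    return out

def centroid(poly):
    """Exact centroid of a convex polygon via the shoelace formula."""
    poly = np.asarray(poly)
    x, y = poly[:, 0], poly[:, 1]
    xs, ys = np.roll(x, -1), np.roll(y, -1)
    cross = x * ys - xs * y
    area = cross.sum() / 2.0
    return np.array([((x + xs) * cross).sum(), ((y + ys) * cross).sum()]) / (6.0 * area)

# Hypothetical example: minimize f(x) = ||x - target||_1 over the square [-1, 1]^2.
target = np.array([0.6, -0.3])
f = lambda x: np.abs(x - target).sum()
subgrad = lambda x: np.sign(x - target)  # a valid subgradient of f

S = [np.array(v, dtype=float) for v in [(-1, -1), (1, -1), (1, 1), (-1, 1)]]
best, best_val = None, np.inf
for t in range(30):
    c = centroid(S)                      # COG oracle
    if f(c) < best_val:
        best, best_val = c, f(c)         # keep the best centroid so far
    g = subgrad(c)                       # subgradient oracle
    if np.allclose(g, 0):
        break                            # optimality condition reached
    S = clip_halfplane(S, g, c)          # S_{t+1} = {x in S_t : <g, x - c> <= 0}

print(best, best_val)  # the best centroid approaches the minimizer
```

In higher dimensions the exact COG oracle is precisely the expensive step discussed in the remarks below.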
One has the following:
COG Method: Convergence
We have the following bound:
$$f(x_t) - \min_{x \in \mathcal{X}} f(x) \le 2 B \left(1 - \frac{1}{e}\right)^{t/n}, \qquad \text{where } B := \sup_{x \in \mathcal{X}} |f(x)|.$$
Some remarks:
- The step "Compute center of gravity" is a highly nontrivial step in general if one wants exact solutions (e.g., think of the problem of computing the COG of a simplex with $n+1$ vertices after it has been cut by a few halfspaces). In this sense, this algorithm is more of theoretical importance than of practical use.
- The volume inequality during the passage from $S_t$ to $S_{t+1}$ will be shown below; it is related to the so-called Grünbaum Inequality, which can in turn be proved using another famous inequality called the Brunn-Minkowski Inequality.
- The analytical complexity to guarantee an accuracy of $\varepsilon$ is $O\big(n \log(2B/\varepsilon)\big)$. Compared to the ellipsoid method's $O\big(n^2 \log(1/\varepsilon)\big)$, both having a linear rate of convergence, it is less sensitive to the underlying dimension $n$.
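As a quick sanity check of the $O(n \log(2B/\varepsilon))$ count, one can evaluate the bound from the theorem above directly; the constants below ($B = 1$, $\varepsilon = 10^{-6}$) are made-up values for illustration.

```python
import numpy as np

# Smallest t with 2 * B * (1 - 1/e)**(t / n) <= eps,
# i.e. t >= n * log(2B / eps) / log(1 / (1 - 1/e)): linear in n.
B, eps = 1.0, 1e-6
for n in [2, 10, 100, 1000]:
    t = n * np.log(2 * B / eps) / -np.log(1 - 1 / np.e)
    print(f"n = {n:5d}: about {int(np.ceil(t))} COG iterations")
```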
2. Math Behind the COG Method
The heart of the COG method lies in the Brunn-Minkowski inequality and Grünbaum's inequality. The latter ensures that the peel we gain control of in each step of the COG method is large enough, while the former relates the volumes of two convex sets to the volumes of their convex combinations.
For the casual readers: at this point you already understand what the COG method does🎉!
For the more serious folks here, let us now dive into the beautiful maths.
2.1. Additional Notations
We introduce the following notations:
- Given $A, B \subseteq \mathbb{R}^n$ and $\lambda, \mu \in \mathbb{R}$, we write: $$\lambda A + \mu B := \{\lambda a + \mu b : a \in A,\ b \in B\},$$ which is just a dirty notation for constructing cones and convex combinations. Sometimes we do not distinguish between elements and one-point sets.
- We write $\omega_d := \mathrm{vol}(B^d)$, the volume of the unit ball (in $\mathbb{R}^d$).
- Given a measurable subset $A \subseteq \mathbb{R}^n$ of positive finite volume, we write $\mathrm{cog}(A) := \frac{1}{\mathrm{vol}(A)} \int_A x \,\mathrm{d}x$, which satisfies: $$\mathrm{cog}(T(A)) = T\big(\mathrm{cog}(A)\big) \quad \text{for every invertible affine map } T.$$
- Given $A \subseteq \mathbb{R}^n$ and $c \in \mathbb{R}$, we define $A_{\le c}$ as the set of elements $x \in A$ with $x_1 \le c$. We define $A_{\ge c}$ analogously.
- Given $A \subseteq \mathbb{R}^n$, $c \in \mathbb{R}$, we can consider the slice: $$A_c := \{y \in \mathbb{R}^{n-1} : (c, y) \in A\}.$$
2.2. Brunn-Minkowski Inequality
A good reference would be R. J. Gardner's review3. The Brunn-Minkowski inequality in the case of convex bodies can be stated concisely as follows4:
Brunn-Minkowski Inequality
Fix convex bodies $A, B \subseteq \mathbb{R}^n$.
- Additive. The function $t \mapsto \mathrm{vol}\big((1-t)A + tB\big)^{1/n}$ is concave on $[0, 1]$.
- Multiplicative. The function $t \mapsto \mathrm{vol}\big((1-t)A + tB\big)$ is log-concave on $[0, 1]$.
We will show Item 1. Item 2 follows from Item 15, by the fact that $\log \mathrm{vol}(\cdot)$ and $\log\big(\mathrm{vol}(\cdot)^{1/n}\big)$ differ by the constant factor $n$, and that $\log$ is increasing and concave on the places we are interested in (so composing it with a concave function preserves concavity). To show Item 1, it suffices to show the following:
$$\mathrm{vol}(A + B)^{1/n} \ge \mathrm{vol}(A)^{1/n} + \mathrm{vol}(B)^{1/n},$$
since concavity then follows from the homogeneity $\mathrm{vol}(\lambda A)^{1/n} = \lambda\, \mathrm{vol}(A)^{1/n}$ for $\lambda \ge 0$.
By the definition of the Lebesgue measure, it suffices to show the case where $A, B$ are finite unions of sets of the form $[a_1, b_1] \times \cdots \times [a_n, b_n]$ (boxes). Write $N$ for the least total number of boxes needed to write $A$ and $B$ as unions of almost-disjoint boxes, and assume the claim holds for all smaller values of $N$. We prove by induction on $N$:
- Base Case. If $N = 2$ (both $A$ and $B$ are boxes), we may assume $A = \prod_{i=1}^{n}[0, a_i]$, $B = \prod_{i=1}^{n}[0, b_i]$, giving:
$$\frac{\mathrm{vol}(A)^{1/n} + \mathrm{vol}(B)^{1/n}}{\mathrm{vol}(A+B)^{1/n}} = \prod_{i=1}^{n}\left(\frac{a_i}{a_i + b_i}\right)^{1/n} + \prod_{i=1}^{n}\left(\frac{b_i}{a_i + b_i}\right)^{1/n} \le \frac{1}{n}\sum_{i=1}^{n}\frac{a_i}{a_i+b_i} + \frac{1}{n}\sum_{i=1}^{n}\frac{b_i}{a_i+b_i} = 1,$$
by applying AM-GM to the terms $\frac{a_i}{a_i+b_i}$ and $\frac{b_i}{a_i+b_i}$.
- Inductive Step. If $N \ge 3$, then one of the two sets, say $A$, needs at least two boxes, and two of these boxes can be separated by a coordinate hyperplane. By permuting coordinates and realignment, we may assume $A_{\le 0}$, $A_{\ge 0}$ have positive measures, with both needing strictly fewer boxes than $A$. Translate $B$ so that
$$\theta := \frac{\mathrm{vol}(A_{\le 0})}{\mathrm{vol}(A)} = \frac{\mathrm{vol}(B_{\le 0})}{\mathrm{vol}(B)},$$
and note that each of the pairs $(A_{\le 0}, B_{\le 0})$ and $(A_{\ge 0}, B_{\ge 0})$ needs fewer than $N$ boxes in total, which concludes the proof:
$$\begin{aligned}\mathrm{vol}(A+B) &\ge \mathrm{vol}(A_{\le 0} + B_{\le 0}) + \mathrm{vol}(A_{\ge 0} + B_{\ge 0})\\ &\ge \left(\mathrm{vol}(A_{\le 0})^{1/n} + \mathrm{vol}(B_{\le 0})^{1/n}\right)^{n} + \left(\mathrm{vol}(A_{\ge 0})^{1/n} + \mathrm{vol}(B_{\ge 0})^{1/n}\right)^{n}\\ &= \big(\theta + (1-\theta)\big)\left(\mathrm{vol}(A)^{1/n} + \mathrm{vol}(B)^{1/n}\right)^{n},\end{aligned}$$
where the second inequality is the induction hypothesis.
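The base case lends itself to a quick numerical sanity check; the sketch below samples random box side lengths (all values arbitrary) and verifies the superadditivity of $\mathrm{vol}^{1/n}$, using that $\mathrm{vol}(A+B) = \prod_i (a_i + b_i)$ holds exactly for boxes.

```python
import numpy as np

# Check vol(A+B)^{1/n} >= vol(A)^{1/n} + vol(B)^{1/n} for random box pairs.
rng = np.random.default_rng(0)
for _ in range(10000):
    n = rng.integers(1, 8)
    a = rng.uniform(0.1, 5.0, size=n)  # side lengths of box A
    b = rng.uniform(0.1, 5.0, size=n)  # side lengths of box B
    lhs = np.prod(a + b) ** (1 / n)
    rhs = np.prod(a) ** (1 / n) + np.prod(b) ** (1 / n)
    assert lhs >= rhs - 1e-12
print("Brunn-Minkowski holds on all sampled box pairs")
```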
Aside
Another common way to prove it is to first use the Prékopa–Leindler inequality (specialized to characteristic functions) to establish the multiplicative version, and then derive the additive version. The interested reader can consult this lecture note.
2.3. Grünbaum Inequality
I highly recommend checking out this lecture note from this course by Lap Chi Lau.
Grünbaum Inequality
Given a convex body $K \subseteq \mathbb{R}^n$ with COG at the origin, we have $\mathrm{vol}(K_{\le 0}) \ge \frac{1}{e}\,\mathrm{vol}(K)$. (By rotating $K$, the same bound holds for $K \cap H$ for every halfspace $H$ whose bounding hyperplane passes through the COG.)
The proof is a series of clever reductions to simpler cases. It would be beneficial to give definitions for these cases. Given a nonnegative function $r$ with support a closed interval containing $0$, we can define a solid of revolution, given by:
$$K_r := \left\{ (x, y) \in \mathbb{R} \times \mathbb{R}^{n-1} : x \in \mathrm{supp}(r),\ \lVert y \rVert_2 \le r(x) \right\}.$$
Note that $K_r$ is convex iff $r$ is concave on its support, and that $\mathrm{vol}(K_r) = \omega_{n-1} \int_{\mathbb{R}} r(x)^{n-1} \,\mathrm{d}x$ by Fubini. We will see that this concavity is the entry point of the Brunn-Minkowski inequality in the proof of this theorem.
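The volume identity can be sanity-checked numerically. The sketch below takes the unit ball in $\mathbb{R}^n$, whose radius profile is $r(x) = \sqrt{1 - x^2}$, and compares $\omega_{n-1}\int r^{n-1}$ with $\omega_n$; the dimension and discretization parameters are arbitrary choices.

```python
import numpy as np
from math import gamma, pi

def unit_ball_vol(d):
    # omega_d = pi^{d/2} / Gamma(d/2 + 1)
    return pi ** (d / 2) / gamma(d / 2 + 1)

# Check vol(K_r) = omega_{n-1} * integral of r(x)^{n-1} on the unit ball of R^n.
n = 4
xs = np.linspace(-1.0, 1.0, 2_000_001)
r = np.sqrt(np.clip(1.0 - xs**2, 0.0, None))
vol = unit_ball_vol(n - 1) * np.mean(r ** (n - 1)) * 2.0  # Riemann sum over [-1, 1]
print(vol, unit_ball_vol(n))  # both are approximately pi^2 / 2 for n = 4
```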
2.3.1. Special Case: Cone
We show the inequality in the case of cones; that is, in the case where $K = K_r$ with $\mathrm{supp}(r) = [-a, b]$ and with $r$ being linear, vanishing at the apex $-a$. We may assume $r(x) = x + a$ (rescaling the $y$-coordinates changes neither side of the inequality). Writing $h := a + b$ and substituting $u := x + a$, we get $a = \frac{n}{n+1} h$ by the following equation utilizing the condition that the COG of $K_r$ is $0$ (equivalently, $\int x \, r^{n-1} = 0$):
$$0 = \int_{-a}^{b} x \, r(x)^{n-1} \,\mathrm{d}x = \int_{0}^{h} (u - a)\, u^{n-1} \,\mathrm{d}u = \frac{h^{n+1}}{n+1} - a\,\frac{h^{n}}{n},$$
which gives:
$$\frac{\mathrm{vol}\big((K_r)_{\le 0}\big)}{\mathrm{vol}(K_r)} = \frac{\int_{-a}^{0} r^{n-1}\,\mathrm{d}x}{\int_{-a}^{b} r^{n-1}\,\mathrm{d}x} = \frac{a^n}{h^n} = \left(\frac{n}{n+1}\right)^{n} = \left(1 + \frac{1}{n}\right)^{-n} \ge \frac{1}{e}.$$
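The exact apex-side fraction $(n/(n+1))^n$ can be tabulated against the bound $1/e$; a tiny sketch:

```python
import numpy as np

# Apex-side mass fraction of a cone cut at its COG, versus the bound 1/e.
for n in [1, 2, 3, 5, 10, 100, 1000]:
    frac = (n / (n + 1)) ** n
    print(n, frac, frac >= 1 / np.e)  # decreases to 1/e from above
```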
2.3.2. Special Case: Convex Solids of Revolution
Now we show the inequality in the case of solids of revolution. Suppose the convex body has the form $K = K_r$ with COG at the origin; then $r$ is concave, say with support $[-a, b] \ni 0$. Associate to $r$ a function $r'$, with the following properties:
- $r' : \mathbb{R} \to \mathbb{R}_{\ge 0}$ with support a closed interval $[a', b']$ containing $0$;
- $r'$ is non-negative, linear on $[a', b']$;
- $r'(a') = 0$ (so $K_{r'}$ is a cone with apex on the negative axis), $\mathrm{vol}\big((K_{r'})_{\le 0}\big) = \mathrm{vol}\big((K_r)_{\le 0}\big)$, $\mathrm{vol}\big((K_{r'})_{\ge 0}\big) = \mathrm{vol}\big((K_r)_{\ge 0}\big)$;
then by concavity of $r$ and these conditions, one can show that, for a suitable choice of such $r'$:
- $r'$ dominates $r$ on an interval $[x_1, x_2]$ containing $0$.
- $r$ dominates $r'$ on the complement of $[x_1, x_2]$.
One can further show that the COG of $K_{r'}$ is then non-positive, say $c' := \mathrm{cog}(K_{r'})_1 \le 0$, so, noting $(K_{r'})_{\le c'} \subseteq (K_{r'})_{\le 0}$, the special case for cones gives:
$$\mathrm{vol}\big((K_r)_{\le 0}\big) = \mathrm{vol}\big((K_{r'})_{\le 0}\big) \ge \mathrm{vol}\big((K_{r'})_{\le c'}\big) \ge \frac{1}{e}\,\mathrm{vol}(K_{r'}) = \frac{1}{e}\,\mathrm{vol}(K_r).$$
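Since the construction above is only sketched, here is a small numerical illustration with the cylinder profile $r \equiv 1$ on $[-1, 1]$ and $n = 2$: we build one admissible cone $r'$ (apex placed at the left endpoint, both side volumes matched) and observe that its COG is indeed non-positive. All discretization choices are arbitrary.

```python
import numpy as np

n = 2
xs = np.linspace(-1.0, 1.0, 400001)
r = np.ones_like(xs)                          # cylinder profile on [-1, 1]
dx = xs[1] - xs[0]
V_minus = (r[xs <= 0] ** (n - 1)).sum() * dx  # left volume (up to omega_{n-1})
V_plus = (r[xs >= 0] ** (n - 1)).sum() * dx   # right volume

# Cone with apex at a' = -1: r'(x) = s * (x + 1).
s = 2 * V_minus                               # matches the left volume s/2 = V_minus
# Match the right volume: s * (b''^2 / 2 + b'') = V_plus.
b2 = -1 + np.sqrt(1 + 2 * V_plus / s)
ys = np.linspace(-1.0, b2, 400001)
w = (s * (ys + 1)) ** (n - 1)                 # cross-section weights of the cone
cog = (ys * w).sum() / w.sum()
print(cog)  # approximately -0.114 < 0: the COG sits on the apex side
```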
2.3.3. General Case: Reduction to Solids of Revolution
For a general convex body $K$ with COG at the origin, consider the function $r : \mathbb{R} \to \mathbb{R}_{\ge 0}$, given by:
$$r(x) := \left( \frac{\mathrm{vol}_{n-1}(K_x)}{\omega_{n-1}} \right)^{\frac{1}{n-1}}.$$
By Brunn-Minkowski (applied in $\mathbb{R}^{n-1}$ to the slices, via $K_{\lambda x + (1-\lambda)x'} \supseteq \lambda K_x + (1-\lambda) K_{x'}$), it is a compactly-supported concave non-negative function, so by the identity $\mathrm{vol}(K_{\le c}) = \omega_{n-1} \int_{-\infty}^{c} r(x)^{n-1}\,\mathrm{d}x = \mathrm{vol}\big((K_r)_{\le c}\big)$, valid for all $c$, we are done by considering the solid of revolution $K_r$6, which has the same volume on each side of the cut and the same first COG coordinate as $K$:
$$\mathrm{vol}(K_{\le 0}) = \mathrm{vol}\big((K_r)_{\le 0}\big) \ge \frac{1}{e}\,\mathrm{vol}(K_r) = \frac{1}{e}\,\mathrm{vol}(K).$$
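To close the loop, Grünbaum's inequality can also be probed by Monte Carlo on a random simplex, whose exact COG is the average of its vertices; the dimension, sample size, and seed below are arbitrary.

```python
import numpy as np

# Cut a random simplex in R^n through its COG with a random hyperplane
# and estimate the volume fraction on each side.
rng = np.random.default_rng(1)
n = 5
V = rng.normal(size=(n + 1, n))            # vertices of a random simplex
cog = V.mean(axis=0)                       # exact COG of a simplex
W = rng.dirichlet(np.ones(n + 1), 200000)  # uniform samples in the simplex
X = W @ V
g = rng.normal(size=n)                     # random cut direction
frac = ((X - cog) @ g >= 0).mean()
print(min(frac, 1 - frac), ">= 1/e ?", min(frac, 1 - frac) >= 1 / np.e)
```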
Bubeck, Sébastien. “Convex optimization: Algorithms and complexity.” Foundations and Trends® in Machine Learning 8.3-4 (2015): 231-357. ↩︎
Ellipsoid Method, on this site. ↩︎
Gardner, Richard. "The Brunn-Minkowski Inequality." Bulletin of the American Mathematical Society 39.3 (2002): 355-405. ↩︎
For generalists, we remind the reader that this theorem holds in much more general settings. Our assumptions guarantee that $(1-t)A + tB$ (which is also a convex body) is measurable with positive finite volume, which avoids unfun cases. ↩︎
One can also show that the multiplicative version implies the additive version. We don't need it here though. ↩︎
This is often called the "symmetrization" of $K$ in the literature. ↩︎