Configurations with many incidences and no triangles

Given a set \mathcal{P} of points and a set \mathcal{L} of lines, both in \mathbb{R}^{2}, an incidence is a pair (p,\ell) \in \mathcal{P} \times \mathcal{L}. Answering a question of Erdos and Purdy, Szemeredi and Trotter proved the following important theorem:

\displaystyle I(\mathcal{P},\mathcal{L}) = O\left(|\mathcal{P}|^{2/3}|\mathcal{L}|^{2/3}+|\mathcal{P}|+|\mathcal{L}|\right).

The Szemeredi-Trotter theorem is one of the most remarkable results in incidence geometry not only because it is extremely useful in applications all across combinatorics, but also because it is one of the few optimal incidence theorems available. Indeed, there are two somewhat different constructions achieving the equality (up to constants): one due to Elekes and another one, less well-known, due to Erdos himself.

In this blog post, we will discuss an interesting result of Solymosi, which showcases something that these two constructions have in common (and as a matter of fact, something that all configurations with \Theta\left(|\mathcal{P}|^{2/3}|\mathcal{L}|^{2/3}+|\mathcal{P}|+\mathcal{L}\right) incidences have in common).

Theorem. (Solymosi) Let \mathcal{P} be a set of points and let \mathcal{L} be a set of lines, both in \mathbb{R}^{2}. Suppose that

\displaystyle I(\mathcal{P},\mathcal{L})=\Theta\left(|\mathcal{P}|^{2/3}|\mathcal{L}|^{2/3}+|\mathcal{P}|+|\mathcal{L}|\right).

Then, there must exist three non-collinear points in \mathcal{P}, with the property that every pair of points among them determines a line which is from the set \mathcal{L}.

From now on, we will also no-so-surprisingly refer to such local configurations simply as triangles, since that’s what they actually are in the plane, however note that this could easily become confusing if one is not careful, since graph theoretically these configurations actually represent copies of a cycle of length six in the incidence graph \mathcal{P} \times \mathcal{L}. Solymosi’s theorem can be therefore rephrased as saying that for every set of points \mathcal{P} and every set of lines \mathcal{L} which do not determine any triangles, we have that

\displaystyle I(\mathcal{P},\mathcal{L})=o\left(|\mathcal{P}|^{2/3}|\mathcal{L}|^{2/3}+|\mathcal{P}|+|\mathcal{L}|\right).

We will actually not dwell too much on the proof of Solymosi’s theorem, since in this post I would like to instead focus on a lower bound construction. Nonetheless, I’d like to at least record the strategy, since the idea is very simple and interesting. As a starting point, recall that the Szemeredi-Trotter theorem can be proved in a few ways by amplifying the trivial bounds I(\mathcal{P},\mathcal{L}) \leq |\mathcal{P}| + |\mathcal{L}|^{2} and I(\mathcal{P},\mathcal{L}) \leq |\mathcal{P}| + |\mathcal{L}|^{2}. Most famously, this has been done in an absolutely beautiful indirect manner by Szekely via the so-called crossing number inequality; however, more modernly, this has also been achieved in various other more practical ways by using partitioning theorems (such as the Guth-Katz polynomial partitioning theorem but also using older partitioning results such as the simplicial cutting theorem of Matousek (used also by Solymosi); see for instance these lecture notes of Guth or this blog post of Tao). Solymosi’s main observation is that under the additional assumption that the incidence graph \mathcal{P} \times \mathcal{L} contains no copy of a cycle of length six, one can improve on the trivial inequalities above even before the amplification step. More precisely, one can show for instance that I(\mathcal{P},\mathcal{L}) \leq 2|\mathcal{P}| + o(|\mathcal{L}|^{2}) as follows.

First, let \mathcal{P}' be the set of points in \mathcal{P} which lie on at most two lines from \mathcal{L}; note that

\displaystyle I(\mathcal{P},\mathcal{L}) = I(\mathcal{P}',\mathcal{L}) + I(\mathcal{P} \backslash \mathcal{P}',\mathcal{L}) \leq 2|\mathcal{P}| + I(\mathcal{P} \backslash \mathcal{P}',\mathcal{L}).

Then, consider the graph G(\mathcal{L}) with vertex set \mathcal{L} and with edges defined as follows: for each point x \in \mathcal{P} \backslash \mathcal{P}', consider the set of lines \mathcal{L}(x) which contain of x, and cover up the vertices corresponding to \mathcal{L}(x) in G(\mathcal{L}) by edge-disjoint triangles (up to a small error). The number of edges in G(\mathcal{L}) thus constructed is then roughly the same as the number of incidences I(\mathcal{P} \backslash \mathcal{P}',\mathcal{L}), so if for some reason I(\mathcal{P} \backslash \mathcal{P}',\mathcal{L}) = \Omega(|\mathcal{L}|^{2}), this would mean that G(\mathcal{L}) has \Omega(|\mathcal{L}|^2) edge-disjoint triangles. By the triangle removal lemma, having this many edge-disjoint triangles would further imply that our graph G(\mathcal{L}) must actually have \Omega(|\mathcal{L}|^3) triangles in total, and so some of these triangles would most definitely have to come from different points in \mathcal{P}, by design. It is easy to see that this would immediately be in conflict with the hypothesis that there are no cycles of length six in the incidence graph \mathcal{P} \times \mathcal{L}. Therefore, it follows that I(\mathcal{P} \backslash \mathcal{P}',\mathcal{L}) = o(|\mathcal{L}|^{2}), which gives the claim above. Amplfying this bound instead of the trivial bound, one can then recover Solymosi’s theorem.

We will now move on to describe a construction of a set points and a set lines with many incidences and no triangles, which shows that Solymosi’s theorem is in some sense sharp in the regime when |\mathcal{P}| \gg |\mathcal{L}|^{2-\epsilon} or |\mathcal{P}| \gg |\mathcal{L}|^{2-\epsilon} for every \epsilon > 0. The construction is in the style of Elekes’s construction for the original Szemeredi-Trotter theorem, but with an additive combinatorics twist to force out triangles.

A set of positive integers A is called a non-averaging set of order k if for each 1 \leq u,v \leq k, the equation

\displaystyle u x_{1} + v x_{2} = (u+v)x_{3}

has no solutions with x_{1}, x_{2}, x_{3} all in A and all pairwise distinct. Furthermore, let s_{k}(n) denote the size of the largest non-averaging set of order t inside the interval \left\{1,\ldots,n\right\}. In particular, note that s_{1}(n) \leq r_{3}(n), where the latter denotes as usual the size of the largest three-term progression free set inside \left\{1,\ldots,n\right\}. It is well-known that there exists some absolute constant c > 0 such that for every sufficiently large n one has

\displaystyle r_{3}(n) \geq n \exp{(-c \sqrt{\log n})},

thanks to a celebrated construction due to Behrend. Interestingly, it is not too hard to modify this construction to show that \left\{1,\ldots,n\right\} also contains very large non-averaging sets of order k, for every k which is reasonably small compared to n. Indeed, the following nice observation of Stanchescu holds.

Theorem (Stanchescu). For every k \geq 1, there exists some absolute constant c>0 such that

\displaystyle s_{k}(n) \geq n \exp{(-c \log k \sqrt{\log n})}

for every sufficiently large n.

We can now finally get to the construction. Let r and s denote some large positive integers, with a precise relationship between them to be specified later. For now, we shall only think of r as being much larger than s. Let A then be a non-averaging set of order s inside \left\{1,\ldots,r\right\} and satisfying

\displaystyle |A| \geq r \exp{(-c \log s \sqrt{\log r})}

for some absolute constant c > 0. Define the set of points \mathcal{P} \subset \mathbb{R}^{2} by

\displaystyle \mathcal{P} = \left\{(x,y):\ x \in A, y \in \left\{1,\ldots,2rs\right\}\right\}.

Let \ell_{m,n} be the line in \mathbb{R}^{2} defined by the equation y=mx+n, and let \mathcal{L} be the set of lines defined by

\displaystyle \mathcal{L} = \left\{\ell_{m,n}: m \in \left\{1,\ldots,s\right\}, n \in \left\{1,\ldots,rs\right\}\right\}.

Note that |\mathcal{P}| = 2 |A|rs, whereas |\mathcal{L}| = rs^{2}. Furthermore, note that for every line \ell_{m,n} in \mathcal{L} and every x \in A, there is a unique element mx + n in \left\{1,\ldots,2rs\right\} such that (x,mx+n) lies on \ell_{m,n}. In particular, each line \ell_{m,n} determines at least |A| incidences with P, so I(P,L) \geq |A|rs^2.

In particular, if say s \ll \exp((\log r)^{c}) for some c < 1/2, then |A| \gg r^{1-\epsilon} for every \epsilon > 0, and

\displaystyle I(\mathcal{P},\mathcal{L}) \geq |A|rs^{2} \gg r^{2-\epsilon}s^{2} \gg (r^{3}s^{3})^{\frac{2}{3}-\frac{\epsilon}{3}} \geq (|\mathcal{P}||\mathcal{L}|)^{\frac{2}{3}-\frac{\epsilon}{3}}.

We next verify that in this configuration there are no non-collinear three points in \mathcal{P}, with each pair lying on a line from \mathcal{L}. We argue this by contradiction. Suppose there are three lines \ell_{1},\ell_{2},\ell_{3} in \mathcal{L} and three points p_{1},p_{2},p_{3} in \mathcal{P} such that their vertices induce a cycle of length six in \mathcal{P} \times \mathcal{L}, and the points p_1, p_2, p_3 are not collinear. Without loss of generality, suppose the lines \ell_{1} and \ell_{2} meet at p_{1} := (x_1,y_1),, lines \ell_{2} and \ell_{3} meet at p_{2} := (x_2,y_2), and the lines \ell_{3} and \ell_{1} meet at p_{3} := (x_3,y_3).

Also, for each i \in \left\{1,2,3\right\}, let \ell_{i} be described by the equation y=m_{i}x+n_{i}, and without loss of generality assume that m_{1} \geq m_{2} \geq m_{3}. It is easy to check that by definition of p_{1}, we have that (m_{1}-m_{2})x_{1} + n_{1}-n_{2}=0, and we also have similar equations for p_{2} and p_{3}.

Adding up these equations, it follows that

\displaystyle (m_{1}-m_{2})x_{1} + (m_{2}-m_{3})x_{2}=(m_{1}-m_{3})x_{3}.

But x_{1},x_{2},x_{3} are all in A and recall A is a non-averaging set of order s, so this would imply that x_{1}=x_{2}=x_{3}. This is a contradiction, since we assumed that the points p_1, p_2, p_3 are not collinear.

It would interesting to see in the future (maybe a future blog post!) whether the triangle removal lemma could be combined instead with the crossing number inequality proof of Szemeredi-Trotter (rather than the cutting method proof) to give an alternate proof for

\displaystyle I(\mathcal{P},\mathcal{L})=o\left(|\mathcal{P}|^{2/3}|\mathcal{L}|^{2/3}+|\mathcal{P}|+|\mathcal{L}|\right).

Another natural question also seems to be whether one can improve quantitatively on the bound above by proving new bounds for the triangle removal lemma which are specific to this geometric setting. This of course may be difficult since the only data point where this type of thing has been accomplished so far is the arithmetic setup in vector spaces over finite fields.

On the Caro-Tuza-Wei inequality

I’d like to start off this blog with a post that is written jointly with my good friend Fedor Petrov.

To set things up, recall that a k-uniform hypergraph H represents a pair (V(H),E(H)) where V(H) is a finite set of vertices and E is a family of k-element subsets of V(H). When k=2, H is simply called a graph. For v \in V(H), its degree in H, which we will denote as \deg_{H}(v), is defined to be the cardinality of the set of edges in E(H) which contain v. A subset S\subset V(H) is an independent set of the hypergraph H if there is no edge in E(H) which has all of its k elements that describe it as part of S. Equivalently, this is sometimes written as {S \choose k} \cap E(H) = \emptyset. Furthermore, the independence number of H, denoted by \alpha(H), represents the maximum size of an independent set in H, and lower bounding it in terms of the degrees of the vertices in H is the main subject of this post.

Estimating the independence number of (hyper)graphs under various hypotheses is a classical and important topic in graph theory, since many extremal combinatorics problems can be rephrased as inequalities about \alpha(H) for carefully defined hypergraphs H. We will not aim to give a complete background for this general problem, so we refer the reader to The Probabilistic by Alon and Spencer or a nice paper of Dutta, Mubayi, and Subramanian for more appropriate accounts. The story I’d like to tell here begins with the beautiful result of Caro and Wei, who independently proved that for every graph G,

\displaystyle \alpha(G) \geq \sum_{v \in V} \frac{1}{\deg(v) + 1}.

In particular, after an application of the Cauchy-Schwarz inequality, this implies the well-known Turan’s theorem about the largest number of edges a graph without a complete subgraph of a fixed size can have. Before the Caro-Wei theorem, Spencer extended Turan’s theorem to k-uniform hypergraphs, so extending the above inequality to k-uniform hypergraphs (and thus improving Spencer’s theorem) became a natural next step. Indeed, Caro and Tuza were able to prove the following generalization, which I will take the liberty to refer to as the Caro-Tuza-Wei inequality.

Theorem 1 (Caro-Tuza). Let k \geq 1 be an integer and let H=(V,E) be a (k+1)-uniform hypergraph. Then,

\displaystyle \alpha(H)\geqslant \sum_{v\in V} \frac{1}{{\deg(v)+1/k\choose\deg(v)}}.

For k=1 this recovers the Caro-Wei theorem, but the proof is much more involved. In this post, we will give a short probabilistic proof of this theorem, by extending the celebrated proof of the Caro-Wei theorem from The Probabilistic by Alon and Spencer [Theorem 1, page 91]. It is important to mention that this is not the first time when this kind of idea is attempted. In fact, in the paper I mentioned above, Dutta, Mubayi and Subramanian generalized the proof of Caro-Wei from Alon and Spencer in an interesting (but rather complicated) way to show that there exists some positive constant d_{k} such that for every k+1 uniform hypergraph H, we have

\displaystyle \alpha(H) \geq d_{k} \sum_{v \in V(H)} \frac{1}{(\deg(v) + 1)^{1/k}}.

This is an easy (and valuable) consequence of the Caro-Tuza theorem, but it seems that the method of Dutta, Mubayi and Subramanian does not quite yield the precise version of Theorem 1. We present an alternative (simpler) approach which achieves this feature.

Probabilistic proof of the Caro-Tuza-Wei Inequality

For each v \in V, let \xi_v be an independent uniform random variable in [0,1]. Let \xi denote the |V|-tuple (\xi_{v})_{v \in V} \in [0,1]^{|V|}. For every edge e\in E in our k-uniform hypergraph H, remove a vertex v from e for which \xi_v=\max_{u\in e} \xi_u. Let S_{\xi} denote the remaining set of vertices. It is easy to check that S_{\xi} is an independent set in H. We claim that the expected value of S_{\xi} over all choices of \xi = (\xi_{v})_{v \in V} in [0,1]^{|V|} is at least the RHS expression from Theorem 1.

Let’s check this. For each vertex v \in V, we estimate the probability that v remains in S_{\xi}. Denote the degree of v by m. For a fixed value \tau \in [0,1] of \xi_v, and a given edge e containing v, the probability that e does not remove v is equal to the probability that there exists u\in e with \xi_u>\tau, which equals 1-\tau^k. Clearly these events for different edges have positive correlation: if some of them hold, it may only help the others to hold. For a linear (k+1)-uniform hypergraph H (which is a (k+1)-uniform hypergraph with the extra property that every two edges have at most one common vertex) they are truly independent. Thus the probability that they hold simultaneously is not less than (1-\tau^k)^{m}, with equality in the linear case. Therefore, the probability that v survives in S_{\xi} is at least the integral \int_0^1 (1-\tau^k)^m d\tau, which we can compute as follows. After two changes of variables, note that

\displaystyle \int_0^1 (1-\tau^k)^m d\tau = \frac1t\int_0^1 (1-\theta)^m\theta^{1/k-1}d\theta = \frac1t \cdot B(m+1,1/k),

where the function B(x,y) stands for the usual beta function

\displaystyle B(x,y) = \int_{0}^{1} \Theta^{x-1}(1-\Theta)^{y-1} d\Theta,

defined for all complex numbers x and y with positive real parts. If

\displaystyle \Gamma(z) = \int_{0}^{\infty} \Theta^{z-1}e^{-\Theta} d\Theta

denotes the standard gamma function, then it is well-known and easy to check that

\displaystyle B(x,y) = \frac{\Gamma(x)\Gamma(y)}{\Gamma(x+y)}.

Indeed, one can write

\displaystyle \Gamma(x)\Gamma(y)= \int_{0}^{\infty} \Theta^{x-1}e^{-\Theta}d\Theta \int_{0}^{\infty} \theta^{y-1}e^{-\theta}d\theta = \int_{0}^{\infty}\int_{0}^{\infty}\Theta^{x-1}\theta^{y-1}e^{-(\Theta+\theta)}d\Theta d\theta.

Using the substitutions \Theta = vt and \theta=v(1-t), we have

\displaystyle \Gamma(x)\Gamma(y) = \int_{0}^{1}t^{x-1}(1-t)^{y-1}dt \int_{0}^{\infty} v^{x+y-1}e^{-v}dv.

The latter expression is precisely B(x,y)\Gamma(x+y). It thus follows that

\displaystyle \frac{1}{t} \cdot B(m+1,1/t) = \frac{1}{t} \cdot \frac{\Gamma(m+1)\Gamma(1/t)}{\Gamma(m+1+1/t)} = \frac{1}{{m+1/t \choose m}}.

Putting these together, we conclude that for each v \in V,

\displaystyle \mathbb{P}\left(v\ \text{survives in}\ S_{\xi}\right) \geq \int_0^1 (1-\tau^k)^m d\tau = \frac1{{m+1/t\choose m}}.

By linearity of expectation, it then follows that

\displaystyle \mathbb{E}_{\xi \in \mathbb{T}^{|V|}}[|S_{\xi}|] = \sum_{v \in V} \mathbb{P}\left(v\ \text{survives in}\ S_{\xi}\right) \geq \sum_{v\in V} \frac{1}{{\deg(v)+1/k\choose\deg(v)}},

so there is some choice of \xi = (\xi_{v})_{v \in V} \in [0,1]^{|V|} for which the corresponding independent set S:=S_{\xi} has size at least \sum_{v\in V} \frac{1}{{\deg(v)+1/k\choose\deg(v)}}. This concludes the proof.