One of the most famous problems in additive combinatorics is that of determining the size of the largest subset of $[N]$ whose difference set contains no non-zero perfect squares. Here $[N] := \{1, \ldots, N\}$, and the difference set of a set $A$ is $A - A := \{a - a' : a, a' \in A\}$. Answering a question of Lovász, Sárközy and Furstenberg independently showed such a set must always have asymptotic density zero. Sárközy's proof was based on the circle method, and actually gave the quantitative bound

$$|A| \ll N \cdot \frac{(\log \log N)^{2/3}}{(\log N)^{1/3}},$$
whereas Furstenberg's approach relied on ergodic theory. Nowadays, there is quite a variety of proofs of the qualitative result $|A| = o(N)$, and the problem of finding strong quantitative estimates has become more or less known as the Furstenberg-Sárközy problem.
This blog post is not going to be about this problem, but I would be remiss to proceed further to talk about a variant of it without mentioning a beautiful recent paper due to Thomas Bloom and James Maynard, where they show that all sets $A \subseteq [N]$ whose difference set contains no nonzero perfect square must always satisfy

$$|A| \ll \frac{N}{(\log N)^{c \log \log \log N}}$$

for some absolute constant $c > 0$. Here $\ll$ is the usual Vinogradov notation (i.e. you can read $X \ll Y$ as $X = O(Y)$). This result manages to improve a bit upon a remarkably stubborn previous bound due to Pintz, Steiger and Szemerédi, which only had an extra log in the exponent of the denominator but which resisted improvement throughout the recent years, despite the multiple developments around the related theorem of Roth about subsets of $[N]$ without three-term arithmetic progressions.
Without getting into too much detail about how the two problems are connected, it is perhaps at least worth discussing a couple of interesting distinguishing features. One first remarkable thing about these strong quantitative bounds for the Furstenberg-Sárközy problem is that they are of noticeably better quality than what is currently known for the size of the largest subset of $[N]$ without nontrivial three-term arithmetic progressions, despite both worlds following a Fourier-analytic density increment argument that shares the same core philosophy (see the recent exciting paper of Bloom and Sisask for more information on the latter). Furthermore, there is no Behrend-type construction available for the Furstenberg-Sárközy problem: the largest known example of a subset of $[N]$ with no perfect squares in its difference set only has size about $N^{0.7334}$. This is due to Lewko, who refined a construction due to Ruzsa. It is a popular belief among additive combinatorialists that there might exist an absolute constant $c > 0$ with the property that all sets $A \subseteq [N]$ without nonzero perfect squares in $A - A$ must satisfy

$$|A| \ll N^{1-c},$$

but that currently seems well out of reach. Moreover, it is not entirely inconceivable that this (already rather bold) conjecture could even hold with every exponent $c < 1 - 0.7334$, matching the construction of Lewko.
There is a lot more to say of course, but, before I get completely carried away with this, I would now like to switch gears and discuss an intriguing variant of the above problem, which asks about the size of the largest $A \subseteq [N]$ such that the difference set $A - A$ contains no nonzero integer with only digits $0$ and/or $1$ when represented in base $q$, for some fixed $q \geq 3$. This question was raised by Terence Tao and can be found as Problem 13 in a very nice survey by Thai Hoang Le.
Despite the seemingly haphazard nature of the set of nonzero integers with all digits equal to $0$ or $1$ in base $q$, which from now on I will denote by $S_q$, I'd like to begin by emphasizing that this is a very natural (and rather cool) variant of the Furstenberg-Sárközy problem for several reasons. First and foremost, like the perfect squares and the many other "intersective" sets for which similar stories hold (such as the set of cubes or values of various classes of polynomials), $S_q$ is also a sparse set that arises in a rather natural way. It is also not difficult to note that its structure is perfectly suited for an application of the powerful Density Hales-Jewett theorem, which yields the (morally) qualitative result $|A| = o(N)$. In the same fashion, the problem of finding a more reasonable quantitative bound therefore arises.
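To get a quick feel for how sparse $S_q$ is, note that the elements of $S_q$ below $q^n$ are exactly the nonzero $0/1$-combinations of $1, q, \ldots, q^{n-1}$, so there are exactly $2^n - 1$ of them. A small check (my own illustration, not from the post or the papers above; the helper names are mine):

```python
# Count elements of S_q (nonzero integers with all base-q digits 0 or 1)
# below q^n, and compare with the 2^n - 1 prediction.

def digits(m, q):
    """Base-q digits of m, least significant first."""
    ds = []
    while m:
        m, r = divmod(m, q)
        ds.append(r)
    return ds

def in_S(m, q):
    """True if m is nonzero and all its base-q digits are 0 or 1."""
    return m != 0 and all(d in (0, 1) for d in digits(m, q))

q, n = 3, 7
count = sum(in_S(m, q) for m in range(q ** n))
print(count, 2 ** n - 1)  # both are 127: S_q has density ~ (2/q)^n in [q^n]
```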
At the same time, unlike the squares and the other usual intersective sets, for which the Fourier transform exhibits analogous analytic behavior, the situation here is less algebraic in flavor, and it seems a bit more complicated to use the structure of $S_q$ in order to adapt any sensible density increment strategy. Nevertheless, (unrelated) recent work of Maynard on primes with restricted digits comes to mind, where the analytic properties of a similar set turn out to be manageable enough to establish that there are infinitely many primes without a fixed digit in base $10$ or higher, so perhaps there is some hope. I started thinking about this problem a bit a while ago, and, while I didn't have much luck with the Fourier-analytic approach at the time (or rather didn't have enough energy to work on it more seriously...), I had the pleasant surprise of realizing that the quantitative problem exhibits a rather different behavior than the Furstenberg-Sárközy problem, and that establishing (some type of) quantitative upper bound is a much easier problem than it might look initially. I thought the other day that it might be nice to share this story here, since, as usual, it's been a while since my last post!
Indeed, let me begin by sketching a surprisingly simple argument, in the style of Blichfeldt's proof of Minkowski's theorem, which readily shows that
$$|A| \ll_q \frac{N}{\log N}$$

holds for all sets $A \subseteq [N]$ with $(A - A) \cap S_q = \emptyset$. The main idea is to consider the following translates of the set $A$: for each $0 \leq k \leq n$, define

$$A_k := A + t_k,$$

where $t_0 := 0$, $t_k := 1 + q + \cdots + q^{k-1} = \frac{q^k - 1}{q - 1}$ for $1 \leq k \leq n$, and $n := \lfloor \log_q N \rfloor$. Note that for each $k$, the set $A_k$ is contained in $[2N]$, since $t_k \leq q^n \leq N$. Moreover, it is easy to see that the hypothesis on $A$ also implies that for all $0 \leq j < k \leq n$, the sets $A_j$ and $A_k$ are disjoint: if $a + t_j = a' + t_k$ for some $a, a' \in A$, then $a - a' = t_k - t_j = q^{j} + q^{j+1} + \cdots + q^{k-1}$, which is a nonzero element of $S_q$. Hence, $(n+1) \cdot |A| = \sum_{k=0}^{n} |A_k| \leq 2N$, which implies the claim.
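Here is a quick numerical sanity check of this argument (my own hypothetical code, not from the post): it greedily builds a set $A \subseteq [N]$ avoiding differences in $S_3$, forms the translates $A + t_k$, and confirms that they are disjoint subsets of $[2N]$, which forces $|A| \leq 2N/(n+1)$.

```python
# Sanity check of the translate argument in base q = 3:
# t_k = 1 + q + ... + q^{k-1}, so t_k - t_j = q^j + ... + q^{k-1} lies in S_q.

def digits(m, q=3):
    ds = []
    while m:
        m, r = divmod(m, q)
        ds.append(r)
    return ds

def in_S(m, q=3):  # nonzero, with all base-q digits 0 or 1
    return m != 0 and all(d in (0, 1) for d in digits(m, q))

q, n = 3, 6
N = q ** n

# Greedily build a set A in [0, N) whose difference set avoids S_q.
A = []
for m in range(N):
    if all(not in_S(abs(m - a), q) for a in A):
        A.append(m)

# Shifts t_0 = 0 < t_1 < ... < t_n, with t_k - t_j in S_q for k > j.
t = [(q ** k - 1) // (q - 1) for k in range(n + 1)]
assert all(in_S(t[k] - t[j], q) for k in range(n + 1) for j in range(k))

# The translates A + t_k are pairwise disjoint subsets of [0, 2N).
translates = [set(a + tk for a in A) for tk in t]
assert all(s < 2 * N for tr in translates for s in tr)
union = set().union(*translates)
assert len(union) == (n + 1) * len(A)  # disjointness
print(len(A), 2 * N // (n + 1))       # |A| is at most 2N/(n+1)
```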
Needless to say, this is not of the same quality as the bound of Bloom-Maynard (or even the one of Pintz, Steiger and Szemerédi) for the Furstenberg-Sárközy problem, but it is already superior to what one may get by attempting to adapt the initial density increment strategies (e.g. the original one of Sárközy). In fact, the upper bound $N/\log N$ already almost matches the quality of the best known quantitative result for Roth's theorem (although that seems like a pure coincidence).
Regarding lower bounds, one can consider the set $A \subseteq [N]$ which consists of all the numbers with (at most) $n = \lfloor \log_q N \rfloor$ digits in base $q$, each digit lying in $\{0, 1, \ldots, q-2\}$, with constant sum of digits equal to $s$, for some fixed appropriately chosen $s$. It is then not difficult to see that this yields a set whose difference set contains no nonzero integers with only digits $0$ and/or $1$ when written in base $q$: since every digit is at most $q - 2$, adding a nonzero element of $S_q$ produces no carries, and so it strictly increases the digit sum. Choosing $s$ to be the most popular value of the digit sum, the resulting set has size roughly $N^{1 - c/(q \log q)}$. Perhaps a bit less coincidentally this time, when the base $q$ is allowed to grow with $N$ at a rate of about $\sqrt{\log N}$, this yields a construction with

$$|A| \geq N \exp(-c\sqrt{\log N}),$$
which matches the size of the Behrend 3AP-free set.
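The no-carry mechanism behind this construction is easy to test on a small example (a sketch of mine, with $q = 5$ and $n = 4$): since all digits are at most $q - 2 = 3$, adding an integer with $0/1$ digits cannot create a carry, so it changes the digit sum, and the constant-digit-sum set below therefore avoids all such differences.

```python
# Small check (base q = 5): the set of integers with n base-q digits, all at
# most q-2 and with a fixed digit sum, has no difference with all digits 0/1.

from itertools import product

q, n = 5, 4
target = (q - 2) * n // 2  # middle digit sum, 6 here

def value(ds):  # ds = digits, least significant first
    return sum(d * q ** i for i, d in enumerate(ds))

def in_S(m):  # nonzero, with all base-q digits 0 or 1
    if m <= 0:
        return False
    while m:
        m, r = divmod(m, q)
        if r > 1:
            return False
    return True

A = [value(ds) for ds in product(range(q - 1), repeat=n) if sum(ds) == target]
assert all(not in_S(a - b) for a in A for b in A)
print(len(A), q ** n)  # a fairly dense subset of [q^n]
```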
All in all, this means that the size of the largest subset of $[N]$ whose difference set contains no nonzero element of $S_q$ is somewhere between $N^{1 - c/(q \log q)}$ and $O_q(N/\log N)$. This is a smaller gap than the one between the best upper and lower bounds for the Furstenberg-Sárközy problem, but it would still be very nice if we could close it even further, especially since it seems quite likely that one should be able to do better than $N/\log N$ for the upper bound using some Fourier analysis.
Another intriguing direction seems to be the analogous question for longer progressions. For fixed $k \geq 3$, suppose $A \subseteq [N]$ is a set which contains no nontrivial $k$-term arithmetic progressions with common difference in $S_q$, i.e. no pattern of the form

$$x, \; x + d, \; x + 2d, \; \ldots, \; x + (k-1)d,$$

where $d$ is a nonzero element of $S_q$. A standard application of the Density Hales-Jewett theorem yields the more general result that $|A| = o(N)$.
However, can one prove any interesting quantitative bounds when $k \geq 3$? The analogous generalization of the Furstenberg-Sárközy problem has been studied extensively; see for example this paper of Prendiville (and the relevant information therein). Based on the situation for the difference set problem, it seems however quite likely that one may not need to use any kind of (higher order) Fourier analysis in order to come up with a reasonable upper bound (possibly even one of the form $N/(\log N)^{c_k}$ for some constant $c_k > 0$).
Last but not least, it is perhaps worth mentioning that the finite field versions of all these problems are also quite interesting. The problem of determining the size of the largest subset $A \subseteq \mathbb{F}_q^n$ with $A - A$ avoiding all the vertices of the hypercube $\{0,1\}^n$ which have at least one nonzero coordinate is a classical problem in extremal combinatorics. The best known upper bound comes from a paper of Huang, Klurman, and myself, while the best known lower bound is due to Alon, and the latter is in fact the main source of inspiration for the construction of the large $A \subseteq [N]$ with $(A - A) \cap S_q = \emptyset$ discussed earlier. See Theorem 5 in the survey of Le already mentioned above to find out how this goes, although, if you like the problem(s) so far or agadmator chess videos, you might also want to pause for a bit and try to find it yourself at this point! Showing the upper bound is also an elegant application of the polynomial method.
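In the spirit of the lower bound (though I am not claiming this is Alon's actual construction), here is a small antichain-type example in $\mathbb{F}_3^n$, my own illustration: vectors with entries in $\{0, 2\}$ and support of fixed size $\lfloor n/2 \rfloor$. If a difference of two such vectors were a nonzero $0/1$-vector mod $3$, one support would have to contain the other, which is impossible for distinct sets of equal size.

```python
# A small antichain-style construction in F_3^n: vectors with entries in
# {0, 2} and exactly n//2 nonzero coordinates. Differences of two distinct
# such vectors are never nonzero {0,1}-vectors mod 3.

from itertools import product

n = 5
A = [v for v in product((0, 2), repeat=n) if sum(1 for c in v if c) == n // 2]

def diff(x, y):
    return tuple((a - b) % 3 for a, b in zip(x, y))

hypercube = {v for v in product((0, 1), repeat=n) if any(v)}
assert all(diff(x, y) not in hypercube for x in A for y in A)
print(len(A))  # binomial(5, 2) = 10 vectors
```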
For $k \geq 3$, the size of the largest subset $A \subseteq \mathbb{F}_q^n$ with no nontrivial $k$-term arithmetic progressions with common difference in the hypercube is quite mysterious. An application of the Density Hales-Jewett theorem yields, like in the integer story, a qualitative bound of the form $|A| = o(q^n)$. That might be the best known result (as far as I can tell); however, in light of the difference set case and the exciting developments around Croot-Lev-Pach and the cap set problem, it seems quite reasonable to expect that one could do much better. For example, someone (Peter Pach?) told me a while ago at a conference that Seva Lev believes that a bound of the form

$$|A| \leq (q - c)^{n}$$

must hold for some absolute constant $c > 0$.
Later edit: I'd also like to sketch an alternative proof of the fact that if $A \subseteq [N]$ is such that $(A - A) \cap S_q = \emptyset$, then

$$|A| \ll \frac{N}{\sqrt{\log N}}.$$
This is a worse bound by a square root of the log in the denominator than the one discussed above, but perhaps it has some better chances of generalizing for the problem about sets without longer progressions with common difference in $S_q$. The idea is to proceed like in Erdős' argument for the classical Littlewood-Offord problem. Let $B_n := \{\epsilon_0 + \epsilon_1 q + \cdots + \epsilon_{n-1} q^{n-1} : \epsilon_i \in \{0,1\}\}$ denote the set of integers with at most $n$ digits in base $q$, all equal to $0$ or $1$. Note that the intersection of $A$ with $B_n$ has size at most $\binom{n}{\lfloor n/2 \rfloor}$. This follows from Sperner's theorem. Indeed, for each number $a$ in $A \cap B_n$, let $v(a) = (\epsilon_0, \ldots, \epsilon_{n-1})$ be the vector of $0/1$-digits in the base $q$ representation of $a$, and consider the subset

$$F_a := \{\, i : \epsilon_i = 1 \,\} \subseteq \{0, 1, \ldots, n-1\}.$$

Since $(A - A) \cap S_q = \emptyset$, it is easy to see that for every two distinct $a, a'$ in $A \cap B_n$, the corresponding sets $F_a$ and $F_{a'}$ are not contained in one another (if, say, $F_{a'} \subseteq F_a$, then $a - a' = \sum_{i \in F_a \setminus F_{a'}} q^i$ would be a nonzero element of $S_q$); hence $\{F_a : a \in A \cap B_n\}$ forms an antichain. It thus indeed follows by Sperner's lemma that $A \cap B_n$ has size at most

$$\binom{n}{\lfloor n/2 \rfloor}.$$
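Incidentally, this Sperner bound is attained inside $B_n$: the middle layer of $B_n$, i.e. the numbers with exactly $\lfloor n/2 \rfloor$ digits equal to $1$, avoids all differences in $S_3$ (this is just the $q = 3$ case of the constant-digit-sum construction from earlier). A quick check of mine, not from the post:

```python
# The Sperner bound is attained inside B_n: the "middle layer" (numbers with
# exactly n//2 ternary digits equal to 1) has no difference in S_3.

from itertools import combinations
from math import comb

n = 6

def in_S(m, q=3):  # nonzero, with all base-q digits 0 or 1
    if m <= 0:
        return False
    while m:
        m, r = divmod(m, q)
        if r > 1:
            return False
    return True

layer = [sum(3 ** i for i in I) for I in combinations(range(n), n // 2)]
assert len(layer) == comb(n, n // 2)  # 20 numbers
assert all(not in_S(a - b) for a in layer for b in layer)
print(len(layer))
```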
Next, note that the same inequality applies to every shift $x + B_n$, by the same argument: if $a = x + s$ and $a' = x + s'$ both lie in $A$, with $s, s' \in B_n$, and the support of $s'$ is contained in that of $s$, then $a - a' = s - s'$ is a nonzero element of $S_q$. In particular,

$$|A \cap (x + B_n)| \leq \binom{n}{\lfloor n/2 \rfloor}$$

for every integer $x$. Since we can cover $[N]$ by translates of $B_n$, the claim follows by applying this inequality for each piece and summing up the contributions.
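For $q = 3$, the covering step can be made completely explicit (again a check of mine, not from the post): grouping the integers in $[3^n]$ according to the positions of their digits equal to $2$ partitions $[3^n]$ into pieces, each of which sits inside a translate of $B_n$.

```python
# Base-3 check: every m < 3^n splits uniquely as m = x + s digit-by-digit,
# where x uses only digits {0,2} and s uses only digits {0,1}. So the
# translates x + B_n cover [3^n], one piece per pattern of 2-digits.

n = 5

def digits(m, n):
    ds = []
    for _ in range(n):
        m, r = divmod(m, 3)
        ds.append(r)
    return ds

def split(m):
    ds = digits(m, n)
    x = sum((2 if d == 2 else 0) * 3 ** i for i, d in enumerate(ds))
    s = m - x  # remaining digits are 0 or 1, so s lies in B_n
    return x, s

pieces = {}
for m in range(3 ** n):
    x, s = split(m)
    assert all(d in (0, 1) for d in digits(s, n))
    pieces.setdefault(x, []).append(m)

assert sum(len(p) for p in pieces.values()) == 3 ** n  # a genuine partition
print(len(pieces))  # 2^n = 32 pieces, indexed by the positions of 2-digits
```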