Benford’s Law and the power-spectra of natural images

Probability of each of the digits 1 to 9 appearing as the first digit in a number to base 10. Image: Mark R. Diamond

What is the relationship between the power spectra of natural images and Benford’s Law? The preceding question marks this out immediately as a posting for statisticians and vision researchers, but before looking for an answer to the question, one might reasonably ask why anyone would assume that there might be such a connexion. It is on that issue that the rest of this posting focusses.

Benford’s Law [1], which is actually attributable to Simon Newcomb [5], is also known as the “first digit phenomenon”. Specifically, it concerns the fact that in many tables of physical constants and statistical data, but especially in aggregations of such tables, “… the occurrence of numbers is such that the mantissae of their logarithms are equally probable” [5] and that, as a consequence, the probability that the first significant digit of a number is d (for d = 1, 2, …, 9) is P(d) = log10(1 + 1/d).
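As a quick numerical illustration, the logarithmic first-digit distribution is usually written P(d) = log10(1 + 1/d). The short Python sketch below tabulates it for the digits 1 to 9; the function and variable names are mine, chosen only for this example.

```python
import math


def benford_probability(d: int) -> float:
    """Probability that the leading decimal digit equals d under Benford's Law."""
    if not 1 <= d <= 9:
        raise ValueError("leading digit must be between 1 and 9")
    return math.log10(1 + 1 / d)


# Table of first-digit probabilities for d = 1..9.
benford = {d: benford_probability(d) for d in range(1, 10)}

for d, p in benford.items():
    print(f"{d}: {p:.3f}")
```

The nine probabilities sum to one because the terms log10((d + 1)/d) telescope to log10(10).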

In the five decades following Benford’s publication, dozens of people produced flawed, or inadequate, explanations of the first-digit phenomenon. Nonetheless, beginning in 1995 Theodore Hill [2,3,4] published a series of papers related to Benford’s Law, the culmination of which is a theorem that is likely to become almost as important as the Central Limit Theorem.

Firstly, Hill shows that scale-invariance implies Benford’s Law. Scale invariance corresponds to the notion that, if the first digits of statistical tables follow some universal law, then that law should be independent of the units of measurement chosen. In other words, if data on river areas follow Benford’s Law, then it should not matter whether the areas are measured in square miles, square kilometres, or toe-nail lengths! Secondly, and most importantly, Hill was able to show that if distributions are selected at random in an “unbiased” way, and random samples are then taken from each of these distributions, then the significant digits of the combined sample will converge to the logarithmic distribution, and hence conform to Benford’s Law. Hill then goes on to suggest that rivers, being independently formed by diverse natural processes, might have areas that represent a sample from “random distributions selected in an unbiased way”. Similarly, the way in which natural objects overlap, hide and change one another also (probably) leads to their measurements representing random samples from randomly selected distributions. So much for the first part of the question which is at the heart of this posting.
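Both of Hill's points can be explored with a toy simulation. The sketch below is only an illustration under assumptions of my own, not Hill's construction: each "randomly selected distribution" is one of three hypothetical families (uniform, exponential, lognormal) whose scale is drawn log-uniformly over six decades, which is one simple way of making the selection unbiased with respect to scale. The pooled first-digit frequencies are compared with the logarithmic law, and then again after a change of units from square miles to square kilometres.

```python
import math
import random


def first_digit(x: float) -> int:
    """Leading significant decimal digit of a positive number."""
    return int(f"{x:.15e}"[0])


def draw_from_random_distribution(rng: random.Random, n: int) -> list:
    """Pick a family and a scale at random, then draw n positive samples."""
    # Assumption: the scale is log-uniform over six decades.
    scale = 10 ** rng.uniform(0, 6)
    family = rng.choice(["uniform", "exponential", "lognormal"])
    if family == "uniform":
        draws = [rng.uniform(0, scale) for _ in range(n)]
    elif family == "exponential":
        draws = [rng.expovariate(1 / scale) for _ in range(n)]
    else:
        draws = [scale * rng.lognormvariate(0, 1) for _ in range(n)]
    return [x for x in draws if x > 0]


def digit_frequencies(values: list) -> list:
    """Relative frequency of each leading digit; index 0 is unused."""
    counts = [0] * 10
    for x in values:
        counts[first_digit(x)] += 1
    return [c / len(values) for c in counts]


rng = random.Random(0)  # fixed seed so the run is repeatable
pooled = []
for _ in range(2000):  # 2000 randomly chosen distributions
    pooled.extend(draw_from_random_distribution(rng, 20))

benford = [0.0] + [math.log10(1 + 1 / d) for d in range(1, 10)]
original = digit_frequencies(pooled)
# Change of units: 1 square mile is about 2.58999 square kilometres.
rescaled = digit_frequencies([2.58999 * x for x in pooled])

print("digit  Benford  pooled  rescaled")
for d in range(1, 10):
    print(f"{d:5d}  {benford[d]:.3f}    {original[d]:.3f}   {rescaled[d]:.3f}")
```

A simulation of this kind proves nothing, and how closely the pooled sample follows the logarithmic law depends on how the distributions are selected. Replacing the log-uniform scale with a narrow range of scales, for instance, would be expected to weaken the agreement.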

Now for part two. Many vision researchers [6] have noted that the power spectrum of natural images usually takes the form of a power law in the spatial frequency. Specifically, the power at spatial frequency f is approximately C/(f^(2-eps)), where C is simply a scaling constant that is determined by the image contrast, and eps is a small positive value approximately equal to 0.19. As a consequence of the power law, natural images also show scale-invariance in their power spectra. Dan Ruderman [7] showed that artificial images, constructed by creating collages of statistically independent overlapping “objects” whose sizes followed a power-law distribution, also show a power-law distribution in their power spectra. But of course, there is no obvious reason for assuming that two-dimensional collages of objects, whose sizes are deliberately chosen to follow one specific distribution, show anything fundamental about a less carefully constructed, natural, three-dimensional world.
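To make the link between a power law and scale-invariance concrete, the sketch below evaluates the idealized model C/(f^(2-eps)) rather than a spectrum measured from real images. It checks two things: the ratio of the power at 2f to the power at f is the same at every frequency, and a straight-line fit in log-log coordinates recovers the exponent -(2 - eps). The value C = 1 and the list of frequencies are arbitrary choices made for this example.

```python
import math

EPS = 0.19  # value quoted in the text
C = 1.0     # hypothetical contrast-dependent constant


def power(f: float, c: float = C, eps: float = EPS) -> float:
    """Model power at spatial frequency f: C / f**(2 - eps)."""
    return c / f ** (2 - eps)


def loglog_slope(freqs: list, powers: list) -> float:
    """Least-squares slope of log(power) against log(frequency)."""
    xs = [math.log(f) for f in freqs]
    ys = [math.log(p) for p in powers]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


freqs = [1, 2, 4, 8, 16, 32, 64]  # arbitrary frequencies, e.g. cycles per image
slope = loglog_slope(freqs, [power(f) for f in freqs])

# Scale-invariance: doubling the frequency multiplies the power by the
# same factor, 2**-(2 - EPS), whatever the starting frequency.
ratios = [power(2 * f) / power(f) for f in freqs]

print(f"fitted log-log slope: {slope:.2f}")
print("P(2f)/P(f) at each f:", [round(r, 4) for r in ratios])
```

With measured spectra the same log-log fit would be applied to power averaged over orientation, and the points would scatter around the line rather than fall exactly on it.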

Nonetheless, given that Benford’s Law and the power-spectra of natural images both appear to arise as a consequence of object overlap, the two ought to be inherently related. What is missing, however, is a mathematical proof that the images of agglomerations of objects whose sizes are selected at random from distributions which themselves are chosen at random in an unbiased way, will show the same power-spectra as natural images. If that can be done, then Benford’s Law and image power-spectra will be seen to arise from the fundamental mathematical structure of the universe.

I do not have the requisite statistical knowledge even to commence an appropriate analysis, but I have created a variety of computer simulations that strongly suggest that an analysis (or proof?) is worth attempting.

References

[1] Benford, F. (1938). The law of anomalous numbers. Proceedings of the American Philosophical Society, 78, 551-572.

[2] Hill, T. (1995). The significant-digit phenomenon. American Mathematical Monthly, 102, 322-327.

[3] Hill, T. (1995). Base-invariance implies Benford’s Law. Proceedings of the American Mathematical Society, 123(3), 887-895.

[4] Hill, T. (1995). A statistical derivation of the significant-digit law. Statistical Science, 10, 354-363.

[5] Newcomb, S. (1881). Note on the frequency of use of the different digits in natural numbers. American Journal of Mathematics, 4, 39-40.

[6] Ruderman, D. L., & Bialek, W. (1994). Statistics of natural images: scaling in the woods. Physical Review Letters, 73(6), 814-817.

[7] Ruderman, D. L. (1997). Origins of scaling in natural images. Vision Research, 37(23), 3385-3398.

Contributors: Mark R. Diamond