Tutorial 1: Gradient effects within binary systems

This post provides a visual example of gradient behaviour within a univariate binary system.

Here I demonstrate what two binary groupings look like when each group has a standard deviation of 1 on a non-dimensional scale and the two group means are separated by 6 standard deviations. Such a pair has an overlapping coefficient of 0.27%, as shown in the output below, which was computed by integration based on Weitzman’s (1970) overlapping coefficient.

## [1] "0.27%"
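As a cross-check on that figure, here is a minimal Python sketch of the calculation (it is not the R code used for this post). It integrates the pointwise minimum of the two density curves, which is the overlapping coefficient, and compares the result with the closed form 2Φ(−d/2) that holds for two normal distributions with equal standard deviations. The normality of each group, the names `blue` and `gold`, the means 0 and 6, and the integration limits are all assumptions made for illustration.

```python
from statistics import NormalDist

def overlap_coefficient(a, b, lo, hi, steps=20_000):
    """Integrate min(pdf_a, pdf_b) over [lo, hi] with the trapezoid rule."""
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * width
        weight = 0.5 if i in (0, steps) else 1.0  # half weight at the endpoints
        total += weight * min(a.pdf(x), b.pdf(x))
    return total * width

# Assumed setup: two normal groups, SD 1 each, means 6 SDs apart.
blue = NormalDist(mu=0.0, sigma=1.0)
gold = NormalDist(mu=6.0, sigma=1.0)

ovl_numeric = overlap_coefficient(blue, gold, lo=-10.0, hi=16.0)
ovl_closed_form = 2 * NormalDist().cdf(-3.0)  # 2 * Phi(-d/2) with d = 6
ovl_stdlib = blue.overlap(gold)               # built-in check (Python 3.8+)

print(f"{ovl_numeric:.2%}")  # prints 0.27%
```

All three values agree, so under these assumptions the 0.27% figure does not depend on the integration method.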

But the small overlapping coefficient hides the fact that, with, say, 10,000 tokens in each group, outliers can reach far toward the other group, and individual tokens sometimes look as if they belong firmly to the other binary choice, like the one blue dot in the gold cloud. (Note that the y-axis is added only to make the display easier to read; it provides none of the data used in this analysis.)
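The effect can be reproduced with a small simulation. The Python sketch below is not the code behind the figure. It assumes normally distributed groups, an arbitrary random seed, and the illustrative names `blue` and `gold`. It draws 10,000 tokens per group, counts how many land on the wrong side of the midpoint, and counts how many fall inside the range actually observed for the other group.

```python
import random
from statistics import NormalDist

N = 10_000                  # tokens per group, as in the text
SEPARATION = 6.0            # distance between group means, in SDs
MIDPOINT = SEPARATION / 2

rng = random.Random(1)      # arbitrary seed, chosen only for repeatability
blue = [rng.gauss(0.0, 1.0) for _ in range(N)]
gold = [rng.gauss(SEPARATION, 1.0) for _ in range(N)]

# Tokens that land on the other group's side of the midpoint.
blue_past_mid = sum(x > MIDPOINT for x in blue)
gold_past_mid = sum(x < MIDPOINT for x in gold)

# Tokens that sit inside the range actually observed for the other group.
gold_min, blue_max = min(gold), max(blue)
blue_in_gold_range = sum(x >= gold_min for x in blue)
gold_in_blue_range = sum(x <= blue_max for x in gold)

# Expected number of midpoint-crossers per group under the normal assumption.
expected_per_group = N * NormalDist().cdf(-MIDPOINT)

print(f"blue past midpoint: {blue_past_mid}, gold past midpoint: {gold_past_mid}")
print(f"blue inside gold range: {blue_in_gold_range}, "
      f"gold inside blue range: {gold_in_blue_range}")
print(f"expected past midpoint per group: {expected_per_group:.1f}")
```

Under these assumptions, about 13.5 tokens per group are expected to cross the midpoint, even though the overlapping coefficient is only 0.27%. The exact simulated counts change with the seed.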

In short, in a binary system, individual tokens that sit thoroughly within the other group's range will occur through simple random variation, yet they are evidence neither of constant gradient overlap nor against the existence of the binary. Such tokens occur whenever the two groups are close enough relative to the number of examples, with "close enough" determined by simple probability, even in a univariate system (one without outside influences).
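One way to make "close enough in relation to the number of examples" concrete is to compute, for a given separation and group size, how many tokens are expected to cross the midpoint and how likely it is that at least one does. The Python sketch below assumes normal groups with a standard deviation of 1 and independent tokens. The helper names and the grid of separations and group sizes are hypothetical choices for illustration.

```python
from math import expm1, log1p
from statistics import NormalDist

def crossing_probability(separation):
    """Chance that one token lands past the midpoint between the group means."""
    return NormalDist().cdf(-separation / 2)

def expected_crossers(n, separation):
    """Expected number of tokens (out of n) past the midpoint."""
    return n * crossing_probability(separation)

def prob_at_least_one(n, separation):
    """Probability that at least one of n independent tokens crosses."""
    p = crossing_probability(separation)
    return -expm1(n * log1p(-p))  # numerically stable 1 - (1 - p)**n

for separation in (4, 6, 8, 10):
    for n in (100, 10_000, 1_000_000):
        print(f"separation={separation:>2} SD  n={n:>9,}  "
              f"expected={expected_crossers(n, separation):12.4f}  "
              f"P(at least one)={prob_at_least_one(n, separation):.4f}")
```

At a fixed separation, a larger group makes a stray token more likely, which is why the separation has to be judged against the number of examples.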

The RMarkdown file used to generate this post can be found here. Some of the code was modified from code on this site.


Weitzman, M. S. (1970). Measures of overlap of income distributions of white and Negro families in the United States. Washington: U.S. Bureau of the Census.
