## Leonhard Euler

## Sewall G. Wright

## Tan Yıldız

Thirty-three years ago on this very day, the world-renowned population geneticist and eminent fellow of the Royal Society, Sewall Wright, passed away. The two statistical models that he developed early in the twentieth century had cascading effects beyond his own field of inquiry and are now indispensable modi operandi in many scientific fields. These are path analysis and F-statistics: the latter is better suited to population genetic studies, whereas the former is applicable to a motley of fields ranging from sociology to developmental psychology. Apart from his statistical contributions, his legacy is justly acclaimed because he was one of the influential figures in ushering in the new “Neo-Darwinian” paradigm in evolutionary biology.

Path Analysis

Wright developed path analysis in the early 1920s as a novel model, based on multiple regression analysis, suited to gauging the inbreeding coefficients of the cattle and guinea pigs that were frequently the subjects of his studies throughout his early career. To make the model more perspicuous, he opted for a graphical presentation that maps the correlations between the independent and dependent variables. He coined the term “the method of path coefficients” for this new model of regression analysis in a monumental paper in which he discussed his rationale for developing it and lucidly explained its potential benefits in isolating the causal effects of independent variables on the dependent variable (Wright 1921). Put simply, he was compelled to devise this model because independent variables neglected in a given study can confound its results, producing findings that are spurious and therefore unreplicable, and that give no indication of causality even where the findings are genuine. An apposite example of the kind Wright was alluding to is presented by Plomin (2019):

“A third factor might set up the correlation between them. A classic example is the correlation between the number of churches in cities and the amount of alcohol consumed. Religion does not drive you to drink, nor does drinking make you more religious. The correlation is caused by the size of cities: because larger cities have more people, they have more churches and greater consumption of alcohol. Once you control for this third factor, there is no association between the number of churches and the amount of alcohol consumed.”

Although Plomin is expounding on the confounding effects of latent factors in this excerpt, the point is close to Wright’s concern with obtaining reliable magnitudes for the various causal “paths” between the given set of independent variables and the dependent variable. To build such a structural equation model, he made use of Pearson’s equation for the partial correlation between two variables A and B, controlling for a latent third factor C:

$$r_{AB \cdot C} = \frac{r_{AB} - r_{AC}\, r_{BC}}{\sqrt{(1 - r_{AC}^{2})(1 - r_{BC}^{2})}}$$

where $r_{AC}$ and $r_{BC}$ are the correlations of the variable C with the variables A and B, and $r_{AB \cdot C}$ is the correlation between A and B once C is held constant. This allowed the effect of the confounding variable C to be estimated, which in turn allowed for constructing a structural model that maps the causal “paths” with the coefficients of correlation between each pair of variables. To make his novel model a graphic one, he chose a schematic representation of each independent and dependent variable, with lines connecting each pair of correlated variables along with the index of effect size (which he omitted here and replaced with signs indicating whether the correlation between the given variables is positive or negative):

[Figure: path diagram from Wright (1921) [1]]

This model permeated various fields as a polished tool for gauging the interrelations between mutually confounding independent variables and a dependent variable. However, the model as he contrived it gradually became obsolete and was succeeded by causal modeling, which, needless to say, owes much of its theoretical and structural framework to its predecessor.
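The role of the partial correlation in the churches-and-alcohol example can be illustrated with a small simulation. The following is only a sketch on invented data: the variable names, the slopes and the noise levels are assumptions made for illustration, not figures from Wright or Plomin. Both outcomes are generated from city size alone, so any raw correlation between them is spurious by construction.

```python
import math
import random


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(r_ab, r_ac, r_bc):
    """First-order partial correlation of A and B with C held constant:
    (r_AB - r_AC * r_BC) / sqrt((1 - r_AC^2) * (1 - r_BC^2))."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))


# Hypothetical data: city size (C) drives both churches (A) and alcohol (B).
rng = random.Random(0)
city_size = [rng.uniform(1, 100) for _ in range(5000)]
churches = [0.5 * s + rng.gauss(0, 5) for s in city_size]
alcohol = [2.0 * s + rng.gauss(0, 20) for s in city_size]

r_ab = pearson(churches, alcohol)
r_ac = pearson(churches, city_size)
r_bc = pearson(alcohol, city_size)
r_ab_given_c = partial_correlation(r_ab, r_ac, r_bc)

print(f"raw correlation, churches vs alcohol: {r_ab:.3f}")
print(f"partial correlation, city size held constant: {r_ab_given_c:.3f}")
```

The raw correlation comes out strongly positive, while the partial correlation falls to roughly zero, which is the pattern Plomin describes once the third factor is controlled for.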

F-statistics

F-statistics, or fixation indices, measure the genetic differentiation of populations that arises from their respective genetic structure. In other words, they compare the variation in a given genetic pattern or genetically influenced trait between the given populations with the total variation within the respective populations. The three indices that constitute F-statistics grew out of the coefficient that Wright developed in 1922 for measuring inbreeding in cattle. For this he derived the equation:

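$$f_o = \sum \left(\tfrac{1}{2}\right)^{n + n' + 1} (1 + f_a)$$ [3]

where $f_o$ represents the inbreeding coefficient of the given individual, $n$ and $n'$ respectively stand for the number of generations from the individual's sire and dam back to a common ancestor, and $f_a$ represents the inbreeding coefficient of that common ancestor. The sum is taken over every distinct line of descent that connects the sire and the dam through a common ancestor.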
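As a worked illustration of Wright's inbreeding formula, the sketch below sums the term $(1/2)^{n+n'+1}(1+f_a)$ over a list of ancestral paths. The function name and the tuple layout are assumptions chosen for this example; the two pedigrees (half-sib and full-sib matings with non-inbred ancestors) are standard textbook cases rather than data from Wright's paper.

```python
def inbreeding_coefficient(paths):
    """Wright's (1922) inbreeding coefficient.

    `paths` is an iterable of (n, n_prime, f_a) tuples, one per line of
    descent linking sire and dam through a common ancestor:
      n       -- generations from the sire back to the common ancestor
      n_prime -- generations from the dam back to the common ancestor
      f_a     -- inbreeding coefficient of that common ancestor
    """
    return sum(0.5 ** (n + n_prime + 1) * (1.0 + f_a)
               for n, n_prime, f_a in paths)


# Offspring of half-sibs: one shared parent, one generation back on each side.
half_sib_offspring = inbreeding_coefficient([(1, 1, 0.0)])

# Offspring of full sibs: two shared parents, hence two paths.
full_sib_offspring = inbreeding_coefficient([(1, 1, 0.0), (1, 1, 0.0)])

# Half-sib mating where the shared parent is itself inbred (f_a = 0.25).
inbred_ancestor = inbreeding_coefficient([(1, 1, 0.25)])

print(half_sib_offspring)   # 0.125
print(full_sib_offspring)   # 0.25
print(inbred_ancestor)      # 0.15625
```

The last case shows the effect of the $(1 + f_a)$ factor: an inbred common ancestor raises the offspring's coefficient above the value obtained from a non-inbred one.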

An indispensable and significant contribution to the biostatistics of inbreeding, the fixation indices he had initially contrived for the study of inbreeding, especially the fixation index $F_{ST}$, were later adapted to the study of genetic variation within and between groups of individuals or populations. $F_{ST}$ measures were most fruitfully used to gauge the fraction of genetic variation attributable to population subdivision and have been applied to study the genetic differences between infraspecific populations or subspecies. In population genetics, the most widely renowned and groundbreaking study to feature $F_{ST}$ has been Luigi L. Cavalli-Sforza’s The History and Geography of Human Genes, wherein he measured the differences between a panoply of genetically differentiated populations (Cavalli-Sforza 1994).

$F_{ST}$ distances (× 10,000) between selected populations around the world, as measured by Cavalli-Sforza (1994). (Salter 2006)
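To make concrete what "the fraction of genetic variation attributable to population subdivision" means, here is a minimal sketch of one common textbook formulation of the fixation index, $F_{ST} = (H_T - H_S)/H_T$, for a single locus with two alleles. It assumes equally sized subpopulations and uses invented allele frequencies; the function names are hypothetical, and this is not necessarily the estimator used in the studies cited above.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1 - p) at a two-allele locus."""
    return 2.0 * p * (1.0 - p)


def fst_biallelic(allele_freqs):
    """F_ST = (H_T - H_S) / H_T for equally weighted subpopulations.

    allele_freqs -- frequency of one allele in each subpopulation
    H_S -- mean expected heterozygosity within subpopulations
    H_T -- expected heterozygosity of the pooled population
    """
    k = len(allele_freqs)
    h_s = sum(expected_heterozygosity(p) for p in allele_freqs) / k
    pooled_freq = sum(allele_freqs) / k
    h_t = expected_heterozygosity(pooled_freq)
    if h_t == 0.0:
        return 0.0  # locus is fixed everywhere: nothing to partition
    return (h_t - h_s) / h_t


identical = fst_biallelic([0.5, 0.5])   # no differentiation
moderate = fst_biallelic([0.2, 0.8])    # diverged allele frequencies
fixed = fst_biallelic([0.0, 1.0])       # alternative alleles fixed

print(identical, round(moderate, 2), fixed)
```

Under these assumptions the index runs from 0, when the subpopulations share the same allele frequencies, to 1, when each is fixed for a different allele; the intermediate case above gives 0.36.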

Conclusion

Wright’s chief contributions were in population genetics and evolutionary biology, such as his development of fitness landscapes, which map the adaptive peaks and troughs of a given matrix of potential phenotypes. Along with J. B. S. Haldane and Ronald Fisher, a founding figure of modern statistics, he also pioneered the theoretical and statistical foundations of the modern synthesis, which married Mendelian genetics to Darwinian evolution.

Wright with a Piebald guinea pig

(Mar. 3, 2021)

References

[1]: Wright, S. (1921). Correlation and Causation. Journal of Agricultural Research, 20(7), 557–585.

[2]: Plomin, R. (2019). Blueprint: How DNA Makes Us Who We Are (Illustrated ed.). The MIT Press.

[3]: Wright, S. (1922). Coefficients of Inbreeding and Relationship. The American Naturalist, 56(645), 330–338. https://doi.org/10.1086/279872

[4]: Cavalli-Sforza, L. L., Menozzi, P., & Piazza, A. (1994). The History and Geography of Human Genes. Princeton University Press.

[5]: Salter, F. K. (2006). On Genetic Interests: Family, Ethnicity and Humanity in an Age of Mass Migration (1st ed.). Routledge. p. 64.

Figure References

Sewall G. Wright: https://fineartamerica.com/featured/1-sewall-wright-american-philosophical-society.html

Wright with a Piebald guinea pig: https://blog.uvm.edu/cgoodnig/2014/05/22/sewall-wrights-seven-generalizations-about-populations/