Ewan Birney, Jennifer Raff,
Adam Rutherford, Aylwyn Scally
Human genetics tells us
about the similarities and differences between people – in our physical and psychological traits, and in our
susceptibility to disorders and diseases –
but our DNA can also reveal the broader story of our evolution, ancestry and
history. Genetics is a new scientific field, relatively speaking, merely a
century old. Over the last two decades, the pace of discovery has accelerated
dramatically, with exciting new findings appearing daily. Even for scientists
who study this field, it’s difficult to keep up.
Amidst this ongoing surge of
new information, there are darker currents. A small number of researchers,
mostly well outside of the scientific mainstream, have seized upon some of the
new findings and methods in human genetics, and are part of a social-media
cottage-industry that disseminates and amplifies low-quality or distorted
science, sometimes in the form of scientific papers, sometimes as internet
memes – under the guise of euphemisms such
as ‘race realism’ or ‘human biodiversity’. Their arguments, which focus on
racial groupings and often on the alleged genetically-based intelligence
differences between them, have the semblance of science, with technical-seeming
tables, graphs, and charts. But they’re misleading in several important ways. The
aim of this article is to provide an accessible guide for scientists,
journalists, and the general public for understanding, criticising and pushing
back against these arguments.
Human population structure is not race
Racial categories, as most
people understand them today, have some of their roots in the development of
scientific thinking during only the last few centuries. As Europeans explored
and colonised the world, thinkers, philosophers and scientists from those
countries attempted to apply taxonomic structures to the people that they
encountered, and though these attempts were many and varied, they typically
reflected sharp geographic boundaries, and obvious physical characteristics,
such as pigmentation and basic morphology – that
is to say, what people look like. Research in the 20th century found
that the crude categorisations used colloquially (black, white, East Asian etc.)
were not reflected in actual patterns of genetic variation, meaning that
differences and similarities in DNA between people did not perfectly match the traditional
racial terms. The conclusion drawn from this observation is that race is therefore
a socially constructed system, where we effectively agree on these terms,
rather than their existing as essential or objective biological categories.
Some people claim that the
exquisitely detailed picture of human variation that we can now obtain by
sequencing whole genomes contradicts this. Recent studies, they argue, actually
show that the old notions of races as biological categories were basically correct
in the first place. As evidence for this they often point to the images
produced by analyses in studies that seem to show natural clustering of humans
into broadly continental groups based on their DNA. But these claims
misinterpret and misrepresent the methods and results of this type of research.
Populations do show both genetic and physical differences, but the analyses
that are cited as evidence for the concept of race as a biological category actually
undermine it.
Even though geography has
been an important influence on human evolution, and geographical landmasses
broadly align with the folk taxonomies of race, patterns of human genetic variation
are much more complex, and reflect the long demographic history of humankind.
This begins with our origin as a species – Homo
sapiens – in Africa within the last quarter of a million years or so, and is
then shaped by our continual mixing and movement throughout the world that
began within the last 80,000 years. This history means that the greatest amount
of genetic diversity – along with the oldest splits in the human genealogical
‘tree’ – is found within Africa. If an alien, arriving on Earth with no knowledge of our
social history, wished to categorise human ancestry purely on the basis of genetic
data, they would find that any consistent scheme must include many distinct
groups within Africa that are just as different from each other as Africans are
from non-Africans. And, because of the constant migrations of people across the
world, they would find it difficult to identify any natural or obvious
subdivision of people into groups which accurately partitions human genetic variation.
Furthermore, there isn’t
really a human ‘tree’. Although we use this arboreal metaphor to describe
ancestry and evolutionary relationships, the true structure of human ancestry is
far more convoluted. Human populations have continued to diverge, expand and
interact throughout the last 100,000 years, resulting in a continuously
branching and looping ancestral structure: the real history of Homo sapiens
is more like an overgrown thicket than a stately branching tree. Much of the
population structure that we see today in ancestry testing results dates back
only a few thousand years or less. For example, the majority of European genomes
are a mixture of at least three major groups from within the last 10,000 years: the
early hunter-gatherers who first populated the continent; a second wave of
ancestry from the Near East associated with the spread of farming; and a third
contribution from north Eurasia during the Bronze Age (2000–500 BCE).
Geneticists use a variety of
tools to visualise the subtle and complex patterns of genetic variation between
people, and to mathematically cluster them together based on relatedness. Such
methods are helpful for exploring data, but have also been the source of wider
confusion. For example, Principal Component Analysis (PCA) plots often show
distinct, colourful clusters of dots that appear to separate groups of people
from different parts of the world. In some cases, these clusters even seem to
correspond to traditional racial groupings (e.g. ‘Africans’, ‘Europeans’ and
‘Asians’). It is images such as these which are often deployed as genetic
evidence for the existence of separate races. But these methods can be
misleading in ways which non-experts – and even some specialists – are unaware
of. For example, some of the observed genetic clustering is a reflection of the
samples that were included in the study and how they were collected, rather
than any inherent genetic structure. DNA sample collection typically follows
existing cultural, anthropological or political groupings. If samples are
collected based on pre-defined groupings, it’s entirely unsurprising that the
analyses of these samples will return results that identify such groupings.
This does not tell us that such taxonomies are inherent in human biology.
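The effect of sampling described above can be illustrated with a toy
simulation. The sketch below is not a real PCA and uses no real data: it
assumes a hypothetical population whose allele frequencies change smoothly
along a single geographic gradient, and it summarises each person with a
one-dimensional genetic score as a simplified stand-in for a principal
component. The function names, the gradient and every number in it are
illustrative assumptions.

```python
import random


def genetic_score(position, n_loci, rng):
    """Fraction of 'alternate' alleles carried by one diploid individual.

    The alternate-allele frequency rises smoothly with geographic position
    (0.0 to 1.0), so the underlying variation is a gradient with no gaps.
    """
    freq = 0.2 + 0.6 * position  # hypothetical smooth gradient (assumption)
    alleles = sum(rng.random() < freq for _ in range(2 * n_loci))
    return alleles / (2 * n_loci)


def largest_gap(scores):
    """Largest distance between neighbouring individuals on the score axis."""
    ordered = sorted(scores)
    return max(b - a for a, b in zip(ordered, ordered[1:]))


def simulate(n_people=150, n_loci=500, seed=1):
    rng = random.Random(seed)
    # Scheme A: people sampled evenly along the whole gradient.
    even_positions = [i / (n_people - 1) for i in range(n_people)]
    # Scheme B: people sampled only at three pre-defined sites.
    sites = [0.0, 0.5, 1.0]
    clumped_positions = [sites[i % 3] for i in range(n_people)]
    even = [genetic_score(x, n_loci, rng) for x in even_positions]
    clumped = [genetic_score(x, n_loci, rng) for x in clumped_positions]
    return largest_gap(even), largest_gap(clumped)


even_gap, clumped_gap = simulate()
print(f"largest gap, even sampling:    {even_gap:.3f}")
print(f"largest gap, clumped sampling: {clumped_gap:.3f}")
```

Under these assumptions the same continuous gradient should look like one
unbroken spread when sampled evenly, and like three separated clusters when
sampled at three sites. The gaps between the clusters come from where the
samples were collected, not from any discontinuity in the simulated population.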
Some ‘human biodiversity’
proponents concede that traditional notions of race are refuted by genetic
data, but argue that the complex patterns of ancestry we do find should in
effect be regarded as an updated form of ‘race’. However, for geneticists,
other biologists and anthropologists who study this complexity, ‘race’ is simply
not a useful or accurate term, given its clear and long-established implication
of natural subdivisions. Repurposing it to describe human ancestry and genetic
structure in general is misleading and disingenuous. The term ‘population’ is
used in many contexts within the modern scientific literature to refer to
groups of individuals, but it is not merely a more socially acceptable
euphemism for race.
It is often suggested that
geneticists who emphasise the biological invalidity of race are under the thumb
of political correctness, forced to suppress their real opinions in order to
maintain their positions in the academy. Such accusations are unfounded and
betray a lack of understanding of what motivates science. Discoveries,
particularly in biology, have often been challenging or difficult for society
to accept, and scientists throughout history are celebrated for establishing
them in the face of contemporary objections. Indeed, the biological invalidity
of traditional racial categories runs counter to many people’s lived
experience, and is in itself a morally neutral conclusion. If the evidence is
sound, scientific integrity demands that it is published. The charge that
thousands of scientists across the world are covering up a real discovery for
fear of personal or wider social consequences is absurd. Furthermore, it is
important to distinguish between using science to understand the world around us
and setting the rules, distribution of funds and policies of a society. The goal
of scientists is to provide that understanding. At the same time, we appreciate
that societies determine their principles and policies informed by, but
independent of, science.
Traits, IQ and genetic diversity
Traits and characteristics
vary among individuals within and between different parts of the world,
sometimes in ways which are visible, such as with height or pigmentation, and
sometimes in other more cryptic ways, such as with disease susceptibility.
Understanding how genomes influence traits is a major aspect of genetic
research.
There are countless traits
one can measure in humans, but none more controversial than those associated
with intelligence, such as IQ. ‘Human biodiversity’ proponents tend to fixate
on IQ, and one can speculate about why this is and what conclusions they wish
to draw; however, it should be noted that IQ itself is a valid and measurable
trait. Critics often assert that it is an oversimplified metric applied to a
far-too-complex set of behaviours, that the cultural specificity of tests renders
them useless, or that IQ tests really only measure how good people are at doing
IQ tests. Although an IQ score is far from a perfect measure, it does an
excellent job of correlating with, and predicting, many educational, occupational,
and health-related outcomes. IQ does not tell us everything that anyone could
want to know about human intelligence – but because definitions of
“intelligence” vary so widely, no measure could possibly meet that challenge.
‘Human biodiversity’
proponents sometimes assert that alleged differences in the mean value of IQ
when measured in different populations – such as the claim that IQ in some sub-Saharan
African countries is measurably lower than in European countries – are caused
by genetic variation, and thus are inherent. The purported genetic differences
involved are usually attributed to recent natural selection and adaptation to
different environments or conditions. Often there are associated stories about
the causes of this selection, for example that early humans outside Africa
faced a more challenging struggle for survival, or that via historical
persecution and restriction of professional endeavours, Ashkenazi Jews harbour
genes selected for intellectual and financial success.
Such tales, and the claims
about the genetic basis for population differences, are not scientifically
supported. In reality, for most traits, including IQ, it is not only unclear
that genetic variation explains differences between populations, it is also
unlikely. To understand why requires a bit of background.
It is certainly the case
that some traits are the result of local or regional adaptation, corresponding
to differences in particular genes. Indeed, one of the reasons for humankind’s
success as a global species is local adaptation. The majority of this adaptation
is via behaviour and the cultural transmission of successful behaviours, but
there are also cases where the adaptation is genetic, that is, small
modifications occurred within our genomes that enhanced survival in different
environments. For example, coastal populations carry DNA variants that help
them more readily process diets that are rich in oily fish; pastoralist farmers
all over the world evolved the ability to metabolise milk after weaning, largely
through genetic variants that keep producing, into adulthood, a particular enzyme
that would otherwise be switched off by the age of five. Lighter skin evolved to
let more sunlight into our bodies, and thus allow more Vitamin D synthesis, as we
migrated away from the equator. We can see these local adaptations in our DNA.
But they hold for only a minority of traits. For most traits, such as height, or
susceptibility to type 2 diabetes in an environment with ready access to food,
there are very real genetic and physical differences between individuals, but
any group differences do not correspond to traditional race categories.
For traits caused by
regional adaptation, contemporary genetic techniques now allow us to see clear
evidence for recent selection on new genetic variants or patterns at particular
locations in the genome. However, such cases are atypical: most traits have no
obvious or localised signal of recent selection. The lack of regional adaptation
does not hinder genetic approaches, and all traits (whether under recent
adaptive selection or not) can be studied by analysing large numbers of people.
The Genome-Wide Association Study (GWAS) is a powerful tool for finding genetic
variants associated with all sorts of human traits. GWAS researchers take a
group of people with differing values or levels of a trait of interest, and
scan their whole genomes to look for specific sections of DNA where their genetic
variation correlates with their variation in the trait. For most traits, the
GWAS results are complicated. Unlike in more straightforward cases like Sickle
Cell Anaemia, where you’d find a big spike of statistical significance in one
particular gene (the beta-globin gene, whose variation is the primary cause of
the disease), GWAS results typically implicate many thousands of positions in
the genome that, in aggregate, build towards the probability of having a
disease or some level of a particular trait. And so, for height, or heart
disease, or schizophrenia or other complex conditions, we see many small spikes
of significance dotted around the genome – so many that we can’t single out an
individual gene or section of DNA of the kind that sometimes gets characterised
as “the gene for” that particular outcome. Each of the large number of places
across the genome which we associate with a trait contributes a small amount, but
collectively the sum of all these effects means that there is in aggregate a
substantial genetic influence on how the trait varies between people.
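The pattern described above, many small signals that add up to a substantial
aggregate influence, can be sketched with a toy association scan. This is a
simulation under stated assumptions, not a real GWAS pipeline: real studies use
regression models with covariates and corrections for population structure,
whereas the sketch simply correlates each simulated variant with a simulated
trait. The function names, the number of causal variants, the equal effect sizes
and the noise level are all hypothetical.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_gwas(n_people=500, n_variants=200, n_causal=100, seed=7):
    rng = random.Random(seed)
    # genotypes[j][i]: copies (0, 1 or 2) of variant j carried by person i,
    # with an assumed allele frequency of 0.5 at every variant.
    genotypes = [[(rng.random() < 0.5) + (rng.random() < 0.5)
                  for _ in range(n_people)]
                 for _ in range(n_variants)]
    # Assumption: the first n_causal variants each add one unit per copy.
    genetic = [sum(genotypes[j][i] for j in range(n_causal))
               for i in range(n_people)]
    # Non-genetic noise sized to match the expected genetic variance,
    # which is n_causal * 0.5 for these allele frequencies.
    noise_sd = (0.5 * n_causal) ** 0.5
    trait = [g + rng.gauss(0.0, noise_sd) for g in genetic]
    # The scan: variance in the trait explained by each variant on its own.
    per_variant_r2 = [pearson_r(row, trait) ** 2 for row in genotypes]
    # Known only because this is a simulation: the combined genetic effect.
    aggregate_r2 = pearson_r(genetic, trait) ** 2
    return per_variant_r2, aggregate_r2


per_variant_r2, aggregate_r2 = simulate_gwas()
print(f"largest share explained by any single variant: {max(per_variant_r2):.3f}")
print(f"share explained by all causal variants together: {aggregate_r2:.3f}")
```

In this setup no single variant should account for more than a few percent of
the variation in the trait, while the causal variants taken together account
for roughly half of it, which is why there is no one position to label as
“the gene for” the outcome.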
However, GWAS and other
similar approaches are affected by population structure, and hence face the
same issues of dependence on sampling and confounding with cultural factors
mentioned above. Most GWAS have been carried out on people sampled from across
Europe, whose ancestries are consistent with this sampling. In many
cases though, only certain subsets of people are included in these analyses –
for good scientific reasons. For example, samples of “European” populations
used in genetic studies have often excluded as many as 30% of
self-identified Europeans. This is because some individuals introduce hard-to-model
complications into the data, forming distinct sub-clusters or complicating the
genetic model. For instance, Finns and Sardinians are often excluded as they
have quite distinct genetic ancestries compared to many other Europeans, as are
some people in India and north Africa, some Latinos/Hispanics, and many
individuals with complex ancestries, despite confident self-identification
within their ethnic group. Researchers therefore often exclude them from the set
of people used in a particular GWAS analysis, on the basis that their unique
population histories can invalidate the statistical models used in these techniques.
This, in turn, can confuse
people who read the studies and observe distinct and seemingly ‘natural’ population
clusters emerge. If they aren’t familiar with the practice of removing these
individuals with more complex ancestries (or don’t read the detailed methods,
which are often tucked away in elusive supplementary sections of a published paper),
they could easily be misled into thinking that the populations in these
analyses are much more distinct than they are in reality. The resulting biases
are poorly understood, and the terminology involved can be confusing to
non-specialists. Furthermore, while it is clear to GWAS researchers that the
results of their analyses tend to be specific to the population studied and
their predictions cannot be reliably extended to other populations with very
different ancestry, this is not widely recognised or understood by
non-specialists.
When it comes to a trait as
complex as cognitive abilities, there is nothing genetically unusual or special
about measures of intelligence such as IQ. Just like the other complex traits
discussed above (such as height or disease susceptibility), measures of
cognitive ability are related to thousands of different genetic variants, each
of which may play a small but significant role in brain development and
function, or any number of other biological processes that are involved in a
person’s cognitive abilities.
IQ scores are heritable:
that is, within populations, genetic variation is related to variation
in the trait. But a fundamental truism about heritability is that it tells us
nothing about differences between groups.
Even analyses that have tried to calculate how much of the difference
between people in different countries is genetic, for a much more
straightforward trait (height), have faced scientific criticisms. Simply put,
nobody has yet developed
techniques that can bypass the genetic clustering and removal of people that do
not fit the statistical model mentioned above, while simultaneously taking into
account all the differences in language, income, nutrition, education,
environment, and culture that may themselves be the cause of differences in any
trait observed between different groups. This applies to any trait you could
care to look at – height, specific behaviours, disease susceptibility,
intelligence.
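Why heritability within groups says nothing about differences between groups
can be shown with a small simulation. The sketch below rests entirely on
hypothetical assumptions: two groups drawn from the same distribution of genetic
values, a trait that is half genetic and half environmental within each group,
and a purely environmental shift applied to one group. It does not model any
real population or any real trait, and all names and numbers are illustrative.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def mean(xs):
    return sum(xs) / len(xs)


def simulate_group(n_people, environment_shift, rng):
    """Genetic values and traits for one group.

    Every group draws genetic values from the same distribution; only the
    environmental shift (an assumption of this sketch) differs between groups.
    """
    genetic = [rng.gauss(0.0, 1.0) for _ in range(n_people)]
    trait = [g + rng.gauss(0.0, 1.0) + environment_shift for g in genetic]
    return genetic, trait


def compare_groups(n_people=2000, seed=3):
    rng = random.Random(seed)
    gen_a, trait_a = simulate_group(n_people, 0.0, rng)
    gen_b, trait_b = simulate_group(n_people, -1.0, rng)
    return {
        # Share of trait variation explained by genetic value inside each group.
        "within_r2_a": pearson_r(gen_a, trait_a) ** 2,
        "within_r2_b": pearson_r(gen_b, trait_b) ** 2,
        # Difference between the groups in average genetic value and in trait.
        "genetic_gap": mean(gen_a) - mean(gen_b),
        "trait_gap": mean(trait_a) - mean(trait_b),
    }


for name, value in compare_groups().items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the trait is substantially heritable inside each group,
yet the gap between the groups is produced entirely by the environmental shift:
the average genetic values of the two groups are the same apart from sampling
noise. A high within-group heritability is therefore compatible with a
between-group difference that has no genetic component at all.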
Not only that, the genetic
knowledge we gain from studying our mainly-European pools of participants
becomes highly unreliable when it is applied to those with different ancestries.
Although it is a common trope to argue that we will have the answer to the
question of the genetic basis of group differences in traits “in the next five
years”, or “in the next decade”, the advances in genomics reveal that the
question is far more complex than we could have imagined, even just a few years
ago. Consequently, anyone who tells you that there’s good evidence on how much
genetics explains group differences (rather than individual differences) is fooling
you – or fooling themselves.
However, there are some strong
hints towards the answer. The genetic variants that are most strongly
associated with IQ in Europeans are no more population-specific than those
associated with any other
trait. To put it bluntly, the same genetic variants associated with purportedly
higher IQ in Europeans are also present in Africans, and have not emerged, or
been obviously selected for, in recent evolutionary history outside Africa.
Moreover, since it is a complex trait, the genetic variation related to IQ is
broadly distributed across the genome, rather than being clustered around a few
spots, as is the nature of the variation responsible for skin pigmentation.
These very different patterns for these two traits mean that the genes
responsible for determining skin pigmentation cannot be meaningfully associated
with the genes currently known to be linked to IQ. These observations alone
rule out some of the cruder racial narratives about the genetics of
intelligence: it is virtually inconceivable that the primary determinant of
racial categories – that is skin colour – is strongly associated with the
genetic architecture that relates to intelligence.
Finally, multiple lines of
evidence indicate that there are complex environmental effects (as might
reasonably be expected) on measures of IQ and educational attainment. Many
socioeconomic and cultural factors are entangled with ancestry in the countries
where these studies are often performed – particularly in the USA, where
structural racism has contributed hugely, and continues to contribute, to
economic and social disparities. We cannot use populations in these countries to
help answer the question of why IQ scores are claimed to be lower in other
countries with entirely different social, economic, and cultural histories, nor
to determine the role of genetics in alleged differences in IQ measures between
groups inside a country with strong societal differences linked to ancestry (for
example, the USA). Thus, confident assertions that current GWAS show us that
‘race’ is associated with cognitive function are simply wrong. It is our
contention that any apparent population differences in IQ scores are more
easily explained by cultural and environmental factors than they are by genetics.
This argument is bolstered
by the observed increase in average IQs over time known as the Flynn Effect.
The political scientist James Flynn observed that IQ was rising in test groups
on average by around three points per decade from the 1930s onwards. Factors
that account for this include improved health, nutrition, standard of living
and education, but changes in genes can be ruled out. Because the effect is
seen in many places around the globe, and has been observed in just a few
years, substantive genetic changes cannot have occurred either within or
between generations. If, for example, the Flynn Effect had not occurred in the
Netherlands, then the average IQ there would currently be as it was in
the 1950s, that is, around 80. A
plausible argument for the putative lower average IQ score in some Sub-Saharan
African countries is that the socio-economic factors behind the Flynn Effect
have not transpired there. If this is indeed the case, or if other factors
explain observed differences in IQ, we believe that explanations relying on
genetic differences between populations are fundamentally unsound.
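The figure of around 80 follows from simple arithmetic on the rate quoted
above. The sketch below is a back-of-envelope check, not a model of the Flynn
Effect: it assumes a constant gain of three points per decade, a present-day
mean of 100 (the conventional norm for IQ tests), and a gap of roughly six and
a half decades since the mid-1950s. The function name and the length of the gap
are assumptions made for illustration.

```python
POINTS_PER_DECADE = 3.0  # average gain quoted in the text
TODAYS_MEAN = 100.0      # IQ tests are conventionally normed to a mean of 100


def past_mean_on_todays_norms(decades_ago):
    """Mean score of an earlier cohort if measured against today's norms.

    Assumes the gain was constant and linear over the whole period, which is
    a simplification.
    """
    return TODAYS_MEAN - POINTS_PER_DECADE * decades_ago


# Assumption: about 6.5 decades separate the mid-1950s from the present.
print(past_mean_on_todays_norms(6.5))
```

A gain of this size within two or three generations is far too fast to be
explained by changes in gene frequencies, which is the point of the argument.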
Conclusion
The advent of new tools and
an enormous surge in genetics research all over the world have inadvertently
revitalised a vocal fringe of race pseudoscience, much of which appeals to our
social experience of the people of the world, and the very real, but socially
determined, races as we describe them colloquially. These novel scientific techniques
are complex and sophisticated, and therefore susceptible to misinterpretation
and misplaced use. It is incumbent upon scientists to understand and help
explain the validity of these tools to other scientists, to journalists and to the
wider public. By understanding both our history
and contemporary research, we are emboldened by knowing that genetics has only
served to undermine its own racist history.
Ewan Birney
European Molecular Biology Laboratory, European
Bioinformatics Institute
Jennifer Raff
Department of Anthropology, University of Kansas.
Adam Rutherford
Genetics,
Evolution & Environment, University College London
Aylwyn Scally
Department of Genetics, University
of Cambridge