Why using genetic risk scores on embryos is wrong

Following the unwise announcement by Genomic Prediction, a US company, of embryo selection using polygenic risk scores, and its exposure in articles such as this by the Economist, the wisdom of selecting embryos by polygenic risk scores has become a concrete question. I do not think it is medically justified or scientifically sound, nor do I think it is ethically wise. In this blog post I justify this position.

Despite my disquiet and skepticism about this usage, I do support the long-standing use of embryo selection via pre-implantation genetic diagnosis (PGD) of embryos for selected rare diseases, and I also support active research and development of polygenic risk scores in adult screening. However, I think the combination of these two technologies is wrong, though I am not saying it is clearly wrong for all time.

A brief reminder. Preimplantation genetic diagnosis involves making a set of in-vitro fertilized human embryos from the prospective parents’ eggs and sperm, and extracting a single cell from each embryo. The removal of a single cell at the right stage does not affect the embryo’s health. The single extracted cell has a full genome, and so can be used to assess the genome of this particular embryo. Take the case of a rare recessive disease such as Cystic Fibrosis: if both parents are heterozygous carriers, on average 1 in 4 of the embryos will go on to develop the disease. Here PGD can test a series of embryos formed from the parents’ eggs and sperm, and only embryos which do not have the disease are implanted into the mother.
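The 1 in 4 figure is simple Mendelian arithmetic, and a short Python sketch makes it explicit. It enumerates the four equally likely allele combinations from two heterozygous carrier parents. The allele labels `A` (working copy) and `a` (disease-causing copy) and the function names are illustrative assumptions, and the sketch ignores test error and embryo viability, which matter in real PGD.

```python
from fractions import Fraction
from itertools import product

def offspring_genotype_probs(parent1, parent2):
    """Probability of each unordered offspring genotype at a single locus."""
    probs = {}
    for allele1, allele2 in product(parent1, parent2):
        # "Aa" and "aA" are the same genotype, so sort the two alleles.
        genotype = "".join(sorted((allele1, allele2)))
        probs[genotype] = probs.get(genotype, Fraction(0)) + Fraction(1, 4)
    return probs

# Both parents are heterozygous carriers of a recessive disease allele "a".
probs = offspring_genotype_probs("Aa", "Aa")
p_affected = probs["aa"]

def p_at_least_one_unaffected(n_embryos):
    """Chance that a batch of embryos contains at least one without the disease."""
    return 1 - p_affected ** n_embryos

print(probs)
print(p_at_least_one_unaffected(3))
```

Because each embryo is an independent draw, even a small batch is very likely to contain an unaffected embryo, which is part of why single-locus PGD is practical.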

For polygenic risk scores (PRS), a genome-wide model of the genetic component of a trait (for example, heart disease risk or breast cancer risk) is developed by looking at hundreds of thousands of individuals with (partially) known outcomes and high density (though not necessarily full genome) sequence data. These statistical models, once validated and tested, can be used to provide a risk estimate for new individuals, potentially before outcomes are known (if they fit the baseline criteria of the model, usually determined by the population they come from). Both the research and the testing of these technologies are on adults, and the most near-term areas being explored are existing population-wide screening programmes, such as for heart attack and breast cancer risk.
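To make the idea of a genome-wide model concrete, here is a minimal Python sketch of the form such scores commonly take: a weighted sum of risk-allele counts across variants. The weights and genotypes below are made up for illustration, and the function name is hypothetical; real models use very many variants with weights estimated from large studies.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of risk-allele counts (0, 1 or 2 copies per variant)."""
    if len(dosages) != len(weights):
        raise ValueError("need exactly one weight per variant")
    return sum(d * w for d, w in zip(dosages, weights))

# Hypothetical per-variant weights; a negative weight marks a protective allele.
weights = [0.10, -0.05, 0.20, 0.02]

# Hypothetical allele counts for two people at the same four variants.
person_a = [2, 0, 1, 1]
person_b = [0, 2, 0, 1]

score_a = polygenic_score(person_a, weights)
score_b = polygenic_score(person_b, weights)
print(score_a, score_b)
```

A raw score like this only becomes a risk estimate once it is compared with the distribution of scores in the population the model was built on, which is why the baseline criteria matter.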

So – if one is comfortable with both of these technologies, then the argument is that combining them to choose embryos (children) who have lower “bad” risk (for example, lower risk of heart disease) could be a good thing. As I mentioned at the start, I disagree on safety, science and ethical grounds.

The first is a straightforward safety aspect. Polygenic risk scores are currently being discussed in a screening process, where the polygenic score either changes the resources assigned to different individuals (for example, bringing forward breast cancer screening visits for a subset of the population) or adds to the overall threshold for clinical intervention along with other factors (for example, recommending statins along with other factors such as cholesterol, age and weight). These interventions are relatively low risk (though certainly not 0, due to the real danger of overdiagnosis in the first case and adverse drug reactions in the second) and the genetic information is blended with other factors. Using polygenic risk scores alone for embryo selection is a large intervention for an uncertain gain (see below) with likely other unintended consequences. For example, selecting embryos on the basis of PRS is likely to increase the chance of recent homozygous haplotypes, and with it the chance that any deleterious recessive mutation is exposed. The fundamental procedure of PGD is itself risky, and the PRS approach will add more, untested risk for the children.

The second is the soundness of the science. Polygenic risk scores are statistical models of the genetic component of the variation in traits in individuals observed in the population; they are not models of which of the limited possibilities for a genome will impact a trait given particular parents. Of course, these two models are expected to be related, but they are not the same. Most trivially, in the case of embryo selection one is constrained by the haplotypes of the parents. This sounds like a small change but it is not; on a population scale for example, the underlying linear model of alleles at a locus is a reasonable approximation to the truth of often more complex recessive/dominant behaviour, but when one is constrained to just four possibilities per locus, this local non-linear behaviour will almost certainly have an effect, and potentially a substantial one. This is the science-modelling inverse of the safety issue I raised above. It is worth noting that the big gains in animal breeding genetics use the genetics almost exclusively for mate selection (for example which bull’s semen is used to impregnate which cow), and very rarely for embryo selection (the only cases I know of are deliberately genetically engineered ones, which is a very different situation, and in any case animal breeding is clearly ethically different from human embryo selection). Furthermore the genome is a very big place, and there are very many traits (risks for different diseases etc) which one might want to “optimise” for. However, the big size of the genome coupled with the very many places on the genome that influence each trait means that the underlying model drives towards uncorrelated trait distributions; this means that as the number of traits one wants to simultaneously optimise increases, so does the number of embryos one needs to select from.
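The last point can be illustrated with a back-of-the-envelope calculation in Python. It assumes, purely for illustration, that the traits are fully independent and that an embryo "passes" a trait if it falls in the better half of the possible scores; the function names and the 50% threshold are my assumptions, not part of any real selection procedure.

```python
def p_embryo_passes(n_traits, top_fraction=0.5):
    """Chance one embryo is in the chosen top fraction for every trait,
    assuming the traits are independent of each other."""
    return top_fraction ** n_traits

def p_any_embryo_passes(n_embryos, n_traits, top_fraction=0.5):
    """Chance that at least one embryo in a batch passes on all traits."""
    p_fail = 1 - p_embryo_passes(n_traits, top_fraction)
    return 1 - p_fail ** n_embryos

# With a fixed batch of 5 embryos, adding traits quickly removes any choice.
for n_traits in (1, 2, 3, 5, 10):
    print(n_traits, round(p_any_embryo_passes(5, n_traits), 4))
```

Under these assumptions the expected number of embryos needed to find one that passes on every trait doubles with each added trait, while the number of embryos available from a cycle of IVF stays small.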

Finally this is unlikely to be ethical. The ethics of these procedures needs to be determined by a societal process, not a scientific process; ethical discussions should be informed by the science, but no scientist has a privileged position in the discussion. So, now speaking as a citizen, I think it is a mistake for society to place such fundamental choices about the attributes of children into the hands of parents without an extremely good rationale. Although parents make many choices on behalf of their children, starting with the desire to have a child and continuing through their upbringing, to deliberately select an unchangeable genetic feature for their child from birth must pass the highest level of scrutiny and societal acceptance. In addition there are seductive-sounding traits, such as educational performance, or facial symmetry, or hair or skin colour, which are (broadly) as amenable to genetic analysis as heart disease risk or breast cancer risk – to have parents try to weigh not only the science but also the fundamental rights or wrongs of what to choose as unchangeable genetic components is, in my view, fundamentally changing the relationship between parents and children.

These are the reasons I strongly believe this current procedure is not safe, not scientifically sound and not ethical. But could such a procedure sometime in the future meet these criteria? How would one assess it? Certainly I can imagine in the future more research that extends the definition of severe genetic disease that one could test for from one locus to multiple loci, and it is possible that considerations of genome-wide “background” should be used. This is in some sense an extension of the rare disease cases now. This is quite different in the specifics from the current PRS models, and I would be arguing for considerable research on (for example) local non-linear interactions (recessive / dominance) in this modelling. In these scenarios one could be more certain of the severe consequences of certain genotype combinations, and formally at this point I don’t see this as being fundamentally different ethically from current practice on rare disease. However, we are a long way from this, and the broad, population-wide views of genetic risk are not fit for purpose for this individual risk. Furthermore this will correctly bring up the more complex “slippery slope” arguments towards broad trait selection.

How does one assess all of this, in particular the ethical dimension? Here I am a big proponent of the UK’s Human Fertilisation and Embryology Authority, which regulates this in the UK. A procedure has to pass medical, scientific and ethical bars, and these are considered carefully with evidence and discussion. Critically the ethics is a societal process informed by science but without scientists taking a privileged position. In contrast, I think the largely self-regulated medical practice in the US is dangerous (though I am less of an expert on US regulation).

Until there is good regulation of this technology worldwide I believe we will have more of these stories and more unsavoury applications.

Race, genetics and pseudoscience: an explainer

Ewan Birney, Jennifer Raff, Adam Rutherford, Aylwyn Scally

Human genetics tells us about the similarities and differences between people – in our physical and psychological traits, and in our susceptibility to disorders and diseases – but our DNA can also reveal the broader story of our evolution, ancestry and history. Genetics is a new scientific field, relatively speaking, merely a century old. Over the last two decades, the pace of discovery has accelerated dramatically, with exciting new findings appearing daily. Even for scientists who study this field, it’s difficult to keep up.

Amidst this ongoing surge of new information, there are darker currents. A small number of researchers, mostly well outside of the scientific mainstream, have seized upon some of the new findings and methods in human genetics, and are part of a social-media cottage-industry that disseminates and amplifies low-quality or distorted science, sometimes in the form of scientific papers, sometimes as internet memes – under the guise of euphemisms such as ‘race realism’ or ‘human biodiversity’. Their arguments, which focus on racial groupings and often on the alleged genetically-based intelligence differences between them, have the semblance of science, with technical-seeming tables, graphs, and charts. But they’re misleading in several important ways. The aim of this article is to provide an accessible guide for scientists, journalists, and the general public for understanding, criticising and pushing back against these arguments.

Human population structure is not race

Racial categories, as most people understand them today, have some of their roots in the development of scientific thinking during only the last few centuries. As Europeans explored and colonised the world, thinkers, philosophers and scientists from those countries attempted to apply taxonomic structures to the people that they encountered, and though these attempts were many and varied, they typically reflected sharp geographic boundaries, and obvious physical characteristics, such as pigmentation and basic morphology – that is to say, what people look like. Research in the 20th century found that the crude categorisations used colloquially (black, white, East Asian etc.) were not reflected in actual patterns of genetic variation, meaning that differences and similarities in DNA between people did not perfectly match the traditional racial terms. The conclusion drawn from this observation is that race is therefore a socially constructed system, where we effectively agree on these terms, rather than their existing as essential or objective biological categories.

Some people claim that the exquisitely detailed picture of human variation that we can now obtain by sequencing whole genomes contradicts this. Recent studies, they argue, actually show that the old notions of races as biological categories were basically correct in the first place. As evidence for this they often point to the images produced by analyses in studies that seem to show natural clustering of humans into broadly continental groups based on their DNA. But these claims misinterpret and misrepresent the methods and results of this type of research. Populations do show both genetic and physical differences, but the analyses that are cited as evidence for the concept of race as a biological category actually undermine it.

Even though geography has been an important influence on human evolution, and geographical landmasses broadly align with the folk taxonomies of race, patterns of human genetic variation are much more complex, and reflect the long demographic history of humankind. This begins with our origin as a species – Homo sapiens – in Africa within the last quarter of a million years or so, and is then shaped by our continual mixing and movement throughout the world that began within the last 80,000 years. This history means that the greatest amount of genetic diversity – the oldest splits in the human genealogical ‘tree’ – is found within Africa. If an alien, arriving on Earth with no knowledge of our social history, wished to categorise human ancestry purely on the basis of genetic data, they would find that any consistent scheme must include many distinct groups within Africa that are just as different from each other as Africans are from non-Africans. And they would find it difficult to identify any natural or obvious subdivision of people into groups which accurately partitions human genetic variation, due to the constant migrations of people across the world.

Furthermore, there isn’t really a human ‘tree’. Although we use this arboreal metaphor to describe ancestry and evolutionary relationships, the true structure of human ancestry is far more convoluted. Human populations have continued to diverge, expand and interact throughout the last 100,000 years, resulting in a continuously branching and looping ancestral structure: the real history of Homo sapiens is more like an overgrown thicket than a stately branching tree. Much of the population structure that we see today in ancestry testing results dates back only a few thousand years or less. For example, the majority of European genomes are a mixture of at least three major groups within the last 10,000 years: the early hunter-gatherers who first populated the continent; a second wave of ancestry from the Near East associated with the spread of farming; and a third contribution from north Eurasia during the Bronze Age (2000–500 BCE).

Geneticists use a variety of tools to visualise the subtle and complex patterns of genetic variation between people, and to mathematically cluster them together based on relatedness. Such methods are helpful for exploring data, but have also been the source of wider confusion. For example, Principal Component Analysis (PCA) plots often show distinct, colourful clusters of dots that appear to separate groups of people from different parts of the world. In some cases, these clusters even seem to correspond to traditional racial groupings (e.g. ‘Africans’, ‘Europeans’ and ‘Asians’). It is images such as these which are often deployed as genetic evidence for the existence of separate races. But these methods can be misleading in ways which non-experts – and even some specialists – are unaware of. For example, some of the observed genetic clustering is a reflection of the samples that were included in the study and how they were collected, rather than any inherent genetic structure. DNA sample collection typically follows existing cultural, anthropological or political groupings. If samples are collected based on pre-defined groupings, it’s entirely unsurprising that the analyses of these samples will return results that identify such groupings. This does not tell us that such taxonomies are inherent in human biology.
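The sampling effect can be reproduced in a toy simulation. The Python sketch below invents a population in which allele frequencies change smoothly along a single geographic axis, with no discrete groups at all, and then runs a bare-bones PCA (first component only, by power iteration) on two sets of samples: people collected evenly along the axis, and people collected at just three sites. Every number here (60 people, 1000 variants, the frequency range, the seed) is an assumption chosen for illustration, not a model of any real data set.

```python
import random

def simulate_genotypes(positions, slopes, rng):
    """Allele counts (0, 1 or 2) per person and variant along a smooth cline."""
    rows = []
    for x in positions:
        freqs = [0.5 + d * (x - 0.5) for d in slopes]  # stays within 0.2..0.8
        rows.append([int(rng.random() < p) + int(rng.random() < p) for p in freqs])
    return rows

def first_pc_scores(rows, rng, n_iter=30):
    """Project each person onto the first principal component (power iteration)."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    centred = [[g - mu for g, mu in zip(row, means)] for row in rows]
    columns = list(zip(*centred))
    v = [rng.gauss(0, 1) for _ in means]  # random starting direction
    for _ in range(n_iter):
        scores = [sum(g * w for g, w in zip(row, v)) for row in centred]
        v = [sum(g * s for g, s in zip(col, scores)) for col in columns]
        norm = sum(w * w for w in v) ** 0.5
        v = [w / norm for w in v]
    return [sum(g * w for g, w in zip(row, v)) for row in centred]

def largest_gap_fraction(scores):
    """Largest empty stretch along the axis, as a fraction of the full range."""
    s = sorted(scores)
    return max(b - a for a, b in zip(s, s[1:])) / (s[-1] - s[0])

rng = random.Random(0)  # fixed seed so the sketch is repeatable
n_people, n_variants = 60, 1000
slopes = [rng.uniform(-0.6, 0.6) for _ in range(n_variants)]

even = [i / (n_people - 1) for i in range(n_people)]  # sampled all along the cline
clustered = [0.0] * 20 + [0.5] * 20 + [1.0] * 20      # sampled at three sites only

gap_even = largest_gap_fraction(
    first_pc_scores(simulate_genotypes(even, slopes, rng), rng))
gap_clustered = largest_gap_fraction(
    first_pc_scores(simulate_genotypes(clustered, slopes, rng), rng))
print(f"largest gap on PC1: even {gap_even:.2f}, clustered {gap_clustered:.2f}")
```

The underlying population is identical in both runs. Only the sampling scheme differs, yet the three-site sample leaves wide empty stretches on the first component that look like discrete groups, while the evenly sampled one fills in as a continuum.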

Some ‘human biodiversity’ proponents concede that traditional notions of race are refuted by genetic data, but argue that the complex patterns of ancestry we do find should in effect be regarded as an updated form of ‘race’. However, for geneticists, other biologists and anthropologists who study this complexity, ‘race’ is simply not a useful or accurate term, given its clear and long-established implication of natural subdivisions. Repurposing it to describe human ancestry and genetic structure in general is misleading and disingenuous. The term ‘population’ is used in many contexts within the modern scientific literature to refer to groups of individuals, but it is not merely a more socially acceptable euphemism for race.

It is often suggested that geneticists who emphasise the biological invalidity of race are under the thumb of political correctness, forced to suppress their real opinions in order to maintain their positions in the academy. Such accusations are unfounded and betray a lack of understanding of what motivates science. Discoveries, particularly in biology, have often been challenging or difficult for society to accept, and scientists throughout history are celebrated for establishing them in the face of contemporary objections. Indeed, the biological invalidity of traditional racial categories runs counter to many people’s lived experience, and is in itself a morally neutral conclusion. If the evidence is sound, scientific integrity demands that it is published. The charge that thousands of scientists across the world are covering up a real discovery for fear of personal or wider social consequences is absurd. Furthermore, it is important to distinguish understanding the world around us using science, from the rules, distribution of funds and policies in society. The goal of scientists is to provide that understanding. At the same time, we appreciate that societies determine their principles and policies informed by, but independent of science.

Traits, IQ and genetic diversity

Traits and characteristics vary among individuals within and between different parts of the world, sometimes in ways which are visible, such as with height or pigmentation, and sometimes in other more cryptic ways, such as with disease susceptibility. Understanding how genomes influence traits is a major aspect of genetic research.

There are countless traits one can measure in humans, but none more controversial than those associated with intelligence, such as IQ. ‘Human biodiversity’ proponents tend to fixate on IQ, and one can speculate about why this is and what conclusions they wish to draw; however, it should be noted that IQ itself is a valid and measurable trait. Critics often assert that it is an oversimplified metric applied to a far-too-complex set of behaviours, that the cultural-specificity of tests renders them useless, or that IQ tests really only measure how good people are at doing IQ tests. Although an IQ score is far from a perfect measure, it does an excellent job of correlating with, and predicting, many educational, occupational, and health-related outcomes. IQ does not tell us everything that anyone could want to know about human intelligence – but because definitions of “intelligence” vary so widely, no measure could possibly meet that challenge.

‘Human biodiversity’ proponents sometimes assert that alleged differences in the mean value of IQ when measured in different populations – such as the claim that IQ in some sub-Saharan African countries is measurably lower than in European countries – are caused by genetic variation, and thus are inherent. The purported genetic differences involved are usually attributed to recent natural selection and adaptation to different environments or conditions. Often there are associated stories about the causes of this selection, for example that early humans outside Africa faced a more challenging struggle for survival, or that via historical persecution and restriction of professional endeavours, Ashkenazi Jews harbour genes selected for intellectual and financial success.

Such tales, and the claims about the genetic basis for population differences, are not scientifically supported. In reality for most traits, including IQ, it is not only unclear that genetic variation explains differences between populations, it is also unlikely. To understand why requires a bit of background.

It is certainly the case that some traits are the result of local or regional adaptation, corresponding to differences in particular genes. Indeed, one of the reasons for humankind’s success as a global species is local adaptation. The majority of this adaptation is via behaviour and the cultural transmission of successful behaviours, but there are also cases where the adaptation is genetic, that is, small modifications occurred within our genomes that enhanced survival in different environments. For example, genetic changes have meant that coastal populations have DNA variants that help them more readily process diets that are rich in oily fish; pastoralist farmers all over the world evolved the ability to metabolise milk after weaning, largely through genes that continued to produce a particular enzyme into adulthood that would otherwise be switched off by the age of five. Lighter skin evolved to allow more sunlight, and thus Vitamin D synthesis, into our bodies as we migrated away from the equator. We can see these local adaptations in our DNA. But they only hold for a minority of traits. Most traits, such as height or the susceptibility to type 2 diabetes in an environment with ready access to food, have very real genetic and physical differences between individuals, but any group differences do not correspond to traditional race categories.

For traits caused by regional adaptation, contemporary genetic techniques now allow us to see clear evidence for recent selection on new genetic variants or patterns at particular locations in the genome. However, such cases are atypical: most traits have no obvious or localised signal of recent selection. The lack of regional adaptation does not hinder genetic approaches, and all traits (whether under recent adaptive selection or not) can be studied by analysing large numbers of people. The Genome-Wide Association Study (GWAS) is a powerful tool for finding genetic variants associated with all sorts of human traits. GWAS researchers take a group of people with differing values or levels of a trait of interest, and scan their whole genomes to look for specific sections of DNA where their genetic variation correlates with their variation in the trait. For most traits, the GWAS results are complicated. Unlike in more straightforward cases like Sickle Cell Anaemia, where you’d find a big spike of statistical significance in one particular gene (the beta-globin gene, whose variation is the primary cause of the disease), GWAS results typically implicate many thousands of positions in the genome that, in aggregate, build towards the probability of having a disease or some level of a particular trait. And so, for height, or heart disease, or schizophrenia or other complex conditions, we see many small spikes of significance dotted around the genome – so many that we can’t single out individual genes or sections of DNA that sometimes get characterised as “the gene for” that particular outcome. Each of the large number of places across the genome which we associate with a trait contributes a small amount, but collectively the sum of all these effects means that there is in aggregate a substantial genetic influence on how the trait varies between people.
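The core of the scan described above is a per-variant test of whether allele counts track the trait. The Python sketch below shows that single step on made-up numbers, using an ordinary least-squares slope and a correlation coefficient. The variant names, genotypes and trait values are hypothetical, and a real GWAS would add significance testing, covariates and corrections for population structure that are left out here.

```python
def association(dosages, trait):
    """Least-squares slope and Pearson correlation of trait on allele count."""
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(trait) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    syy = sum((y - mean_y) ** 2 for y in trait)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, trait))
    slope = sxy / sxx               # trait change per extra copy of the allele
    r = sxy / (sxx * syy) ** 0.5    # strength of the linear relationship
    return slope, r

# Hypothetical trait values for six people.
trait = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

# Hypothetical allele counts (0, 1 or 2) for the same six people.
variants = {
    "variant_1": [0, 0, 1, 1, 2, 2],  # tracks the trait closely
    "variant_2": [1, 0, 2, 2, 0, 1],  # unrelated to the trait
}

results = {name: association(d, trait) for name, d in variants.items()}
for name, (slope, r) in results.items():
    print(name, round(slope, 3), round(r, 3))
```

In a real study this calculation is repeated across the whole genome, and for complex traits the typical per-variant effect is tiny, far smaller than the contrived `variant_1` here.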

However, GWAS and other similar approaches are affected by population structure, and hence face the same issues of dependence on sampling and confounding with cultural factors mentioned above. Most GWAS have been carried out on people sampled from across Europe, who have ancestries consistent with this sampling. In many cases though, only certain subsets of people are included in these analyses – for good scientific reasons. For example, samples of “European” populations used in genetic studies have often excluded as many as 30% of self-identified Europeans. This is because some individuals introduce hard-to-model complications into the data, forming distinct sub-clusters or complicating the genetic model. For example, Finns and Sardinians are often excluded as they have quite distinct genetic ancestries compared to many other Europeans, as are some people in India, north Africa, Latino/Hispanics, and many individuals with complex ancestries, despite confident self-identification within their ethnic group. Researchers therefore often exclude them from the set of people used in a particular GWAS analysis, on the basis that their unique population histories can invalidate the statistical models used in these techniques.

This, in turn, can confuse people who read the studies and observe distinct and seemingly ‘natural’ population clusters emerge. If they aren’t familiar with the practice of removing these individuals with more complex ancestries (or don’t read the detailed methods, which are often tucked away in elusive supplementary sections of a published paper), they could easily be misled into thinking that the populations in these analyses are much more distinct than they are in reality. The resulting biases are poorly understood, and the terminology involved can be confusing to non-specialists. Furthermore, while it is clear to GWAS researchers that the results of their analyses tend to be specific to the population studied and their predictions cannot be reliably extended to other populations with very different ancestry, this is not widely recognised or understood by non-specialists.

When it comes to a trait as complex as cognitive abilities, there is nothing genetically unusual or special about measures of intelligence such as IQ. Just like other complex traits discussed above (such as height or disease susceptibility) measures of cognitive ability are related to thousands of different genetic variants, each of which may play small but significant roles in brain development and function, or any number of other biological processes that are involved in a person’s cognitive abilities.

IQ scores are heritable: that is, within populations, genetic variation is related to variation in the trait. But a fundamental truism about heritability is that it tells us nothing about differences between groups. Even analyses that have tried to calculate the proportion of the difference between people in different countries for a much more straightforward trait (height) have faced scientific criticisms. Simply put, nobody has yet developed techniques that can bypass the genetic clustering and removal of people that do not fit the statistical model mentioned above, while simultaneously taking into account all the differences in language, income, nutrition, education, environment, and culture that may themselves be the cause of differences in any trait observed between different groups. This applies to any trait you could care to look at – height, specific behaviours, disease susceptibility, intelligence.
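The truism about heritability can be shown with a toy simulation in Python. Two groups are drawn from exactly the same distribution of genetic values, but one group's environment shifts everyone's measured trait down by 10 points. All the numbers (group size, spreads, the size of the shift, the seed) are assumptions picked for illustration and are not estimates for any real population or trait.

```python
import random
from statistics import mean, pvariance

def simulate_group(n, env_shift, rng):
    """Genetic values and measured traits for one group of n people."""
    genetic = [rng.gauss(0, 10) for _ in range(n)]
    # Measured trait = baseline + genetic value + individual noise
    # + an environmental shift shared by the whole group.
    trait = [100 + g + rng.gauss(0, 5) + env_shift for g in genetic]
    return genetic, trait

rng = random.Random(1)  # fixed seed so the sketch is repeatable
gen_a, trait_a = simulate_group(2000, 0.0, rng)
gen_b, trait_b = simulate_group(2000, -10.0, rng)

# Heritability within each group: share of trait variance that is genetic.
h2_a = pvariance(gen_a) / pvariance(trait_a)
h2_b = pvariance(gen_b) / pvariance(trait_b)

genetic_gap = mean(gen_a) - mean(gen_b)   # close to zero by construction
trait_gap = mean(trait_a) - mean(trait_b) # close to the environmental shift

print(f"heritability: A {h2_a:.2f}, B {h2_b:.2f}")
print(f"genetic gap {genetic_gap:.2f}, trait gap {trait_gap:.2f}")
```

Heritability is high inside both groups, yet the gap between the group averages is entirely environmental. Knowing the within-group heritability alone would not let anyone tell this scenario apart from one where the gap was genetic.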

Not only that, the genetic knowledge we gain from studying our mainly-European pools of participants becomes highly unreliable when it is applied to those with different ancestries. Although it is a common trope to argue that we will have the answer to the question of the genetic basis of group differences in traits “in the next five years”, or “in the next decade”, the advances in genomics reveal that the question is far more complex than we could have imagined, even just a few years ago. Consequently, anyone who tells you that there’s good evidence on how much genetics explains group differences (rather than individual differences) is fooling you – or fooling themselves.

However, there are some strong hints towards the answer. The genetic variants that are most strongly associated with IQ in Europeans are no more population-specific than those for any other trait. To put it bluntly, the same genetic variants associated with purportedly higher IQ in Europeans are also present in Africans, and have not emerged, or been obviously selected for, in recent evolutionary history outside Africa. Moreover, since it is a complex trait, the genetic variation related to IQ is broadly distributed across the genome, rather than being clustered around a few spots, as is the nature of the variation responsible for skin pigmentation. These very different patterns for these two traits mean that the genes responsible for determining skin pigmentation cannot be meaningfully associated with the genes currently known to be linked to IQ. These observations alone rule out some of the cruder racial narratives about the genetics of intelligence: it is virtually inconceivable that the primary determinant of racial categories – that is skin colour – is strongly associated with the genetic architecture that relates to intelligence.

Finally, multiple lines of evidence indicate that there are complex environmental effects (as might reasonably be expected) on measures of IQ and educational attainment. Many socioeconomic and cultural factors are entangled with ancestry in the countries where these studies are often performed – particularly in the USA, where structural racism has historically contributed, and continues to contribute, hugely to economic and social disparities. We cannot use populations in these countries to help answer the question of why IQ scores are claimed to be lower in other countries with entirely different social, economic, and cultural histories, nor to determine the role of genetics in alleged differences in IQ measures between groups inside a country with strong societal differences linked to ancestry (for example, the USA). Thus, confident assertions that current GWAS show us that ‘race’ is associated with cognitive function are simply wrong. It is our contention that any apparent population differences in IQ scores are more easily explained by cultural and environmental factors than they are by genetics.

This argument is bolstered by the observed increase in average IQs over time known as the Flynn Effect. The political scientist James Flynn observed that IQ was rising in test groups on average by around three points per decade from the 1930s onwards. Factors that account for this include improved health, nutrition, standard of living and education, but changes in genes can be ruled out. Because the effect is seen in many places around the globe, and has been observed over just a few years, substantive genetic changes cannot have occurred either within or between generations. If, for example, the Flynn Effect had not occurred in the Netherlands, then the average IQ there would currently be as it was in the 1950s, that is, around 80. A plausible argument for the putative lower average IQ score in some Sub-Saharan African countries is that the socio-economic factors behind the Flynn Effect have not transpired there. If this is indeed the case, or if other factors explain observed differences in IQ, we believe that explanations relying on genetic differences between populations are fundamentally unsound.
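The figure of around 80 follows from the rate of increase quoted above. As a rough check in Python, assuming a steady three points per decade, roughly seven decades between the 1950s and today, and a present-day average scaled to 100 (the elapsed time and the linear trend are simplifying assumptions):

```python
def past_mean_on_todays_scale(points_per_decade, years_elapsed, current_mean=100.0):
    """Average score of an earlier cohort, expressed on today's scale,
    assuming a constant linear rise over the whole period."""
    return current_mean - points_per_decade * years_elapsed / 10

print(past_mean_on_todays_scale(3, 70))
```

Seven decades at three points per decade is a shift of about 21 points, which is far too large and too fast to be the result of genetic change.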

Conclusion

The advent of new tools and an enormous surge in genetics research all over the world have inadvertently revitalised a vocal fringe of race pseudoscience, much of which appeals to our social experience of the people of the world, and the very real, but socially determined races as we describe them colloquially. These novel scientific techniques are complex and sophisticated, and therefore susceptible to misinterpretation and misplaced use. It is incumbent upon scientists to understand and help explain the validity of these tools to other scientists, to journalists and to the wider public. By understanding both our history and contemporary research, we are emboldened by knowing that genetics has only served to undermine its own racist history.

Ewan Birney

European Molecular Biology Laboratory, European Bioinformatics Institute

Jennifer Raff

Department of Anthropology, University of Kansas

Adam Rutherford

Genetics, Evolution & Environment, University College London

Aylwyn Scally

Department of Genetics, University of Cambridge

What do we need to know in the Life Sciences?

Understanding how life works has been a goal of science since its inception. Many scientists do this for intellectual curiosity – the desire to simply understand and know the natural world around us. Other scientists are driven by the application of knowledge to different areas – applications to human health, agriculture, and the care of the environment.

Life is based ultimately on the chemistry of molecules, but shaped by evolution over billions of years into such staggering feats of organisation that the chemistry of these molecules can have these thoughts, transcribe them and share them with you via the chemistry of your molecules. It is quite remarkable. Necessity requires us to catalog these molecules and this chemistry, and conceptualise them into the key parts of these complex systems we call organisms – and so we have genomes, RNAs, proteins, organelles, cells, tissues, organs, physiology and individual organisms. These organisms interact to gain energy and matter, to reproduce and to live in ecosystems. Although life is “just” chemistry, the organisation of this chemistry is ultimately about the control of information over time – control such that one can reproduce similar organisms in the future, and furthermore, control in humans and other selected species of ideas which we can transmit between individuals.

There is a rich vein of philosophy to mine here – what is life? What are the key features of life that distinguish it from other types of chemistry? (I know my colleague Alvis Brazma is writing an excellent book on this). But I want to focus on a different thread – what do we need to know? How big is the knowledge we need to have to understand life?

Catalogs and Mechanism

I will take a simplifying and undoubtedly only partially correct viewpoint. There are two broad types of knowledge in the life sciences we need – catalogs and mechanisms. We need catalogs of things – ultimately these are catalogs of atoms in specific (though sometimes hard to define) configurations but we nearly always form higher level reasonably robust concepts that are many times higher in scale; the concept of a “cell” is one such thing to catalog. Cells – membrane bound collections of biological molecules – are clearly a useful and robust object in much of life – immune cells, hepatocytes, fibroblasts, keratinocytes are useful, recognisable pieces of organisation and themselves building blocks for further catalogs. The concepts of life though are not as neat and tidy as the periodic table of elements – how does the early Drosophila embryo with its syncytium and gradients map to the concept of a cell? Is the membrane bound Sendai virus “a cell” and if it is not, what is the difference from the red blood cell? We should be comfortable that in biology there are not tidy edges to our useful concepts – biology is allowed to leverage any aspect of chemistry to its own end – and these frayed edges in the conceptualisation of biology are part of our science – to be celebrated not hidden and certainly not invalidating the core concepts.

But catalogs by themselves are nowhere near complete; the catalogs do not by themselves tell us how life works. For that we need mechanisms. Mechanisms are how things interact and potentially change. Ultimately this is via chemistry – rearrangements in the configuration of atoms which are thermodynamically favoured. But just as with the catalogs the mechanisms we are talking about require higher level conceptualisation just for us to manage the complexity of life. Consider the action of the ribosome, tRNAs, aminoacyl-tRNA synthetases and mRNAs. This amazing, elegant collection of molecules follows thermodynamic rules (mainly due to the release of free phosphate driving the thermodynamics) which means that the codons in the mRNA produce a specific protein at near 99.99% accuracy. It is remarkable. In theory it is one massive chemical reaction which one can describe as multi-step chemistry (indeed, some noble people have attempted to get to reasonably complete chemical descriptions). But we have to have a concept for this which abbreviates it – “translation” – and this concept is robust enough that we code the logical consequence of this chemical reaction (mRNA translated to the protein). This logical, conceptual mechanism is so well understood it is instantiated in thousands of pieces of programming code around the world without an explicit link to the underlying chemistry. Just as with the catalog concept, mechanism has frayed ends – selenocysteines and frame-shift read through are two such key oddities in translation – non-ribosomal synthesis another key oddity; but these oddities do not somehow invalidate the core concept. And mechanism can be very large scale – the migration of cells during development, or the actions of cells and neurons to achieve homeostasis in circulation in a vertebrate, or the interplay between the commensal gut bacteria and the host cells in digestion, or the social behaviour of groups of individuals. All this I place in mechanism.
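As an illustration of how the concept of translation ends up as a few lines of logic with no reference to the chemistry, here is a minimal sketch in Python. The names (CODON_TABLE, translate) are chosen here for illustration; the table is the standard genetic code, built from the usual compact layout with bases ordered U, C, A, G. It assumes the reading frame starts at the first base and stops at the first stop codon, and it deliberately ignores the frayed ends mentioned above, such as selenocysteine and frame-shifts.

```python
# Standard genetic code in compact form: codons are enumerated with each
# base position cycling through U, C, A, G, first position slowest.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(mrna):
    """Translate an mRNA string (A, C, G, U) into one-letter amino acids.

    Reads codons from the first base and stops at the first stop codon ('*').
    """
    protein = []
    for pos in range(0, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


print(translate("AUGGCCUGGUAA"))  # Met-Ala-Trp, then a stop codon
```

Nothing in the code mentions ribosomes, tRNAs or phosphate; the mechanism has been reduced to a lookup table and a loop, which is exactly the kind of robust abstraction described above.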

Having produced this top level taxonomy of knowledge for biology, we can now list out our needed catalogs and needed mechanisms to have mastery of life. I do not claim this list is complete; I do claim this list is necessary.

(final editorial note; this sort of list is … hubris to attempt to write! In some of these fields I am a genuine expert; in some I am an onlooker with a professional interest; in some I am nothing more than an armchair amateur. I look forward to the inevitable comments which will help improve both the list and the phrasing)

Catalogs

Species. We need a catalog of all living species.

Genomes. We need to know at least one instance of the genome of every species on earth.

Genome Products. We need to know, or have the ability to accurately predict, all the products of each genome, including RNA molecules and protein molecules. The catalog should contain all potential post-transcriptional (RNA) and post-translational (protein) modifications.

Genome Regulation. We need to know, or have the ability to accurately predict, all the points where other molecules (often protein or RNA) interact with the genome.

Protein and RNA structure. We need to know, or have the ability to accurately predict, the atomic configurations of all proteins and RNAs which have relatively stable configurations. For unstable configurations we need a useful description of the feasible configurations. These structures should include all assemblies and complexes.

Non-genome encoded molecules. We need to know all the chemicals present in the cell and their modes of production.

Sub cellular structures. We need to have a catalog of sub cellular structures and ways of understanding the distribution of all types of molecules between them.

Cells and tissues. For every species (ideally; more realistically, every species of high interest) we need to know every cell type and at least one feasible configuration of cells into tissues in a living organism (for C. elegans there is only one configuration, remarkably; for many other species one has to have at least one feasible configuration).

Organs and anatomy. For every species (ideally; more realistically, every species of high interest) we need to know how the tissues with their constituent cells form organs and anatomic structures to make an organism.

Neuronal anatomy. Neuronal and brain anatomy, with its axons, dendrites, spines and connections, is different enough to deserve its own set of concepts (those just listed) and its own catalog of these concepts and their interactions.

Idealised Ecosystems. For every ecosystem of interacting species we need to know the types of species, their numbers and idealised position in a manner which is useful to understand the ecosystem (for example, the presence of symbiosis or conflict, of prey/predation, of location relative to each other).

Global ecosystem. For the entire planet we need a catalog of ecosystems and their locations, including human created ecosystems, with appropriate models of transitions.

Mechanisms

DNA to RNA. Not merely transcription (well described) but when and where transcription happens. For every RNA product we need to have the mechanism that describes what conditions cause its production.

RNA to Protein. Translation. We need to have the mechanism for the production of proteins from each RNA product.

Protein and RNA to 3D structure. The classic “folding problem”. This is looking more tractable than it has done in a while.

Transformation of other molecules. All the transformations of the other molecules in the cell, and when they happen. Basically, metabolism.

Sub cellular trafficking and structure. This is everything from organelle management to the 3D structure in the nucleus.

Cellular decision making. How and why do cells make decisions? Which molecules have to be present and in which configurations for different decisions?

Development. How does each cell come to its final destination and configuration from the fertilized zygote?

Tissue decision making. How and why do collections of cells make decisions?

Organ function, decision making and homeostasis. How does each organ operate? How is its function kept in an appropriate stable or responsive manner?

Neuronal behaviour. How do collections of neurons behave to result in decisions? This is a big topic, and I am tempted to split it into low level circuits and larger emergent properties.

Individual behaviour. How do individuals behave (from commensal bacteria to host interactions to conspecific interactions) in isolated interactions?

Ecosystem behaviour. How do collections of individuals across many different species behave?

For both catalogs and mechanisms, ultimately we will not be able to describe these and just use our brains to remember them – we will need to publish them, share them and, above all, store them appropriately in databases. For the catalogs this is an obvious necessity – humans do not do well at this scale of enumeration and there is little point in trying to know all these things individually (though some of these things are more countable and memorable than one realises – dedicated curators will know a surprisingly large number of genes for example).

The concept of databasing mechanisms is in its infancy. Schemes such as BioModels have the ability to store some of these. Others are published and transmitted in an almost oral history. Some of these are held in specialised structures in model organism databases (for example, the development of C. elegans), but this is more in its infancy.

The sheer complexity of the above list, and its ultimate destination in databases (as well as publications to explain the concepts in the databases) shows the task we need to be prepared for over the coming centuries. I started to annotate each one about the level of completeness, but realised that in itself was a complex task, and a task that often can be broken out into a matrix of catalog vs species, and mechanism vs species – just to enumerate this task we need a database! It shows also how key the life science databases are to this endeavour; they are the ultimate point of knowledge and how we will transmit information between researchers and over time – the narrative in papers will augment, educate and explain – but the data and knowledge will be stored, maintained and used from electronic, online, openly accessible databases.

Why embryo selection for polygenic traits is wrong.

This week (May 20th 2019) has seen yet another splash by an American company offering a polygenic trait score on embryos, including for intelligence. This is wrong on a number of levels: ethically it is wrong to make this decision as an independent laboratory without broad societal buy in; scientifically it is wrong to imagine the ways we assess polygenic traits will translate into safe and effective embryo selection; and for the specific case of IQ/educational attainment, the trait is so complex that this is additionally unwise, over and above those concerns.

I would not recommend it either as a member of society or as a genomic scientist. This blog aims to unpack this more.

Ethics

First off it is important to realise that as science progresses in biology – and in particular reproductive biology – we develop the possibility that we can perform actions that as a society we consider wrong. There is nothing new in this; for example, ultrasound scanning allows one to reliably sex a foetus early on in pregnancy; however parental choice of sex of the child is either explicitly illegal or implicitly prohibited in most locations. As we learn more about genetics, we will be able to make more sophisticated choices of what we could do, but it is important that we make the decision about what we should do as responsible members of society.

This decision has to be made using processes set up inside each society; in practice this means under national legislation. I am both most familiar with and very comfortable with the UK’s Human Fertilisation and Embryology Authority (HFEA) scheme. This is a statutory body set up by the UK Parliament, with a variety of lay and religious members, as well as ethicists and scientists. The UK Parliament has made some possible schemes illegal (for example, reproductive cloning) but otherwise provides considerable latitude for the HFEA to make decisions. It is important that this body has a majority of non-scientists, and when the HFEA licenses a procedure the UK can be very confident it is medically safe, scientifically sound and ethically has broad support.

Each country has to arrange its own affairs, but I think there are some principles of best practice. One is that the scientists and clinicians are not self-regulating here – it needs societal buy in. The second is that it is near impossible to handle this purely via national laws – laws are complex to change and near impossible to write with foresight for future science.

Science of polygenic traits

I am a longstanding genomic scientist, and have broad interest across many topics in genomics and genetics. Despite my cautious enthusiasm for using the genetics of polygenic traits in other medical spheres, in particular to potentially augment our understanding of risk of common diseases in adults, I do not think it is appropriate for embryo selection or assessment, certainly not without more research and potentially not for a long time. The main reason is that we have a high potential to cause harm, and only a small potential to mitigate bad outcomes.

Stepping back – polygenic traits are traits where multiple places in the genome contribute to a trait (poly meaning many, and genic meaning genes). This is well established genetic theory and practice since the 1930s (pre-dating the discovery of the structure of DNA). However, not only are these methods inexact but we simply do not know what other features are linked to the traits – the most sophisticated models deliberately do not attempt to pinpoint the precise genomic locations, because giving this up gains them predictive power. In a situation where one is doing something quite novel (selecting embryos from in vitro fertilized embryos) one simply doesn’t know what would happen. For example, it might be that for some traits there are strong developmental aspects which mean the polygenic score we select on also contributes to developmental defects or to other features in an adult we did not anticipate. There is a big difference between scoring an adult who is alive and well, and selecting embryos for implantation.

You might think I am being paranoid, but the history of animal breeding has shown many unforeseen consequences of mating strategies; for example, selection of fast growing chickens led at first to socially inept chickens who bullied / fought with each other when grown in flocks. This was recognised and eventually a multi-variate breeding scheme was put in place, but it could only be recognised by actually trying it. Selection of embryos on polygenic scores would be an experiment, and one in which we would have true unknowns; some of those unknowns having the potential to cause serious harm.

Some commentators cite the success of animal breeding schemes using genetics as supporting polygenic trait selection of embryos. This is misguided. Despite these schemes employing genetics (and machinery similar to polygenic risk scores, known as “Breeding Values” in the breeding community), the schemes are not the same as embryo selection from a random cross; animal breeding genetics gets its main benefit from selecting mates for breeding, not from selecting embryos of a random mating. I know of no animal breeding scheme which involves embryo selection for breeding traits (although embryo selection is used in the production of transgenic animals or selection of sex in elite breeding lines, and via this blog post I have learnt more about its use in plant and animal breeding). Furthermore, as discussed above, animal genetic breeding schemes are a cautionary tale of how things can go wrong as well as go right. The difference is that “failed” breeding choices in plant and animal breeding are simply discarded – this is not acceptable for humans. Anyone who is using plant or animal breeding as justification for the success of genetic intervention in humans simply does not understand animal breeding.

A further point, which I almost feel is so obvious it is not worth making, but which on reading some articles does seem necessary: the amazing ability to directly edit genomes is of no relevance to this discussion. Polygenic trait “prediction” should perhaps be better stated as “interpolation”, as what is happening is that we take an individual’s genome and try to estimate its phenotype using the phenotypes and genotypes of many, many previous individuals. The most powerful methods to do this deliberately do not model any specific base pair changes (it ends up being statistically more advantageous not to, as our genome moves around broadly in blocks rather than as specific bases); even when we try to estimate the precise bases involved at a particular location, the “blocky” nature of human genetics prevents us from ever being sure. So, although we can steadily improve our ability to use genetics for prediction, this is not by way of knowing the precise changes to make, and if we ever tried to do this it would, again, be an explicit experiment for polygenic traits (for monogenic or digenic traits with high penetrance alleles there is a different argument; in those cases it is extremely hard to imagine a scenario where currently licensed pre-implantation diagnosis would not work but genome editing would work).

Intelligence and Educational Attainment as polygenic traits

The genetics of intelligence and of educational attainment (how well people do at school) is a very complex topic; nevertheless some real progress has been made, in particular over the last 10 years. This blog post is not the place to unpack the complexity of this trait (unsurprisingly … it is complex) nor the validity of the genetics – I recommend work from Stuart Ritchie, Paige Harden, Alex Young, Ian Deary and Robert Plomin as a selection of researchers in this field. My summary for the purposes of this blog post is that IQ and educational attainment are real polygenic traits, but they are the sorts of traits one should be particularly careful about when thinking about embryo selection, over and above my generic concern above, and even when one is trying to focus only on the “severe intellectual disability” end of the spectrum.

There are a number of scientific reasons why. The first is that these traits are hard to estimate and the non-random environment (the fact that schooling is different in different places even in relatively homogeneous environments) coupled with localised genetics means it is hard to know whether one has “scrubbed” out this effect (cryptic population stratification). Again, the potential for selecting against perfectly reasonable embryos (and performing a procedure with risks) for no gain is present. The second is that around one third to a half of the genetic signal for intelligence / educational attainment (depending somewhat on how you construct the statistics) is due to parental environment; because each person’s genetics is also a reasonably good estimator of their parents’ genetics, genetic variants which influence parenting and, via this, the child’s IQ/education show up strongly. This is fascinating research (note; the same techniques that find this do not show a strong effect of parental environment on other traits, e.g. blood lipids or height) but means that estimating the tails has an additional major complication in trying to isolate the true “within individual effect”. Finally there are complex interactions between some deleterious traits (e.g. autism) and educational attainment (there is a weak positive correlation) meaning that this trait in particular is complex to understand.

This is before one gets into the ethical considerations about how one should handle this trait, though if one is focused on the most severe disability end, this is at least justified. However, the obvious (and wrong) snake oil position is to imagine one can rank or significantly select for the top end of a continuous scale. All the problems with the trait are valid at each end; more importantly, a naive view is that one can select from the top or bottom of the population distribution, whereas the main determinants in embryo selection will be the genetics of the father and the mother – one is bounded by these genetics (formally – a small variation around the mean of father and mother in the models used; in practice an embryo rarely goes outside of this expectation).
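The point about being bounded by the parents can be illustrated with a toy simulation. This is a sketch under loud assumptions, not a model of any real score: the trait is purely additive, the 100 loci are unlinked (each parent passes on one of their two alleles at random at every locus), the effect sizes and parental genotypes are made up at random, and all names are hypothetical. Under those assumptions the embryo scores centre on the mid-parent value and can never leave the range set by the alleles the two parents actually carry; how wide the spread is depends entirely on the assumed inputs.

```python
import random

rng = random.Random(1)
N_LOCI = 100  # assumed number of unlinked loci
effects = [rng.uniform(-1, 1) for _ in range(N_LOCI)]  # made-up effect sizes


def random_parent():
    """A parent as a list of (allele, allele) pairs, each allele 0 or 1."""
    return [(rng.randrange(2), rng.randrange(2)) for _ in range(N_LOCI)]


def score(genotype):
    """Additive score: effect size times the number of '1' alleles, summed."""
    return sum(beta * (a + b) for beta, (a, b) in zip(effects, genotype))


def make_embryo(mother, father):
    """Each parent transmits one of their two alleles at random per locus."""
    return [(m[rng.randrange(2)], f[rng.randrange(2)])
            for m, f in zip(mother, father)]


def extreme(parent, pick):
    """Most extreme contribution this parent could possibly transmit."""
    return sum(pick(beta * a, beta * b) for beta, (a, b) in zip(effects, parent))


mother, father = random_parent(), random_parent()
mid_parent = (score(mother) + score(father)) / 2
embryo_scores = [score(make_embryo(mother, father)) for _ in range(5000)]
mean_embryo = sum(embryo_scores) / len(embryo_scores)

# Hard limits set by the alleles the two parents actually carry.
lowest = extreme(mother, min) + extreme(father, min)
highest = extreme(mother, max) + extreme(father, max)

print("mid-parent score:", round(mid_parent, 2))
print("mean embryo score:", round(mean_embryo, 2))
print("observed embryo range:", round(min(embryo_scores), 2), "to", round(max(embryo_scores), 2))
print("possible embryo range:", round(lowest, 2), "to", round(highest, 2))
```

In a real clinic one would be choosing among a handful of embryos rather than 5000, so the selectable range is narrower still than the observed range printed here.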

Summary

It is my position as a citizen in society (in my case, the UK) that one should not use embryo selection for complex behavioural traits, and it is my position as a genome scientist that it would be scientifically unsound to do so for any polygenic trait, in particular for IQ or educational attainment. It is worth considering what would be the closest thing to this that I could endorse. In terms of the science I can imagine (and I believe it is licensed now) digenic (two locus) selection, and in the future I could imagine oligogenic selection. Furthermore I could imagine a case of one or a small number of loci and a polygenic background, where there is differential advising and options to parents with “bad” polygenic backgrounds for a particular disease coupled with some higher effect size loci which could be treated (in effect) as monogenic diseases in their cases. Finally I can see severe behavioural difficulties with a strong monogenic or oligogenic basis as candidates for licensing. But these are growing the scope from the existing practice, and are a very, very long way away from generic polygenic trait scoring for behaviours in embryos.