How to find recessive disease genes for epileptic encephalopathies

The E2 story continues. There has been major progress in identifying the role of de novo mutations in infantile spasms and other epileptic encephalopathies. Over the last two years, more than 20 new genes for epileptic encephalopathies were discovered, and we have good evidence suggesting that de novo mutations play a major role in these disorders. Moreover, we have gotten a good sense of how complicated it can be to call a de novo mutation pathogenic given the flood of rare genetic variants in the human genome. However, de novo mutations are not what we think about clinically when assessing a patient with new-onset epileptic encephalopathy. In a clinical setting, we are often concerned about underlying metabolic disorders, many of which are recessive. Accordingly, we felt that the next task of the E2 consortium was to assess the role of inherited variants in epileptic encephalopathies. Just to tell you in advance, it is not as easy as it sounds.

The flood of recessive variants. Before we tell you about the next project of the E2 working group, let me give you a brief overview of why the analysis of inherited variants is complicated. In previous posts, we discussed the huge number of rare genetic variants in the human genome. We are currently in the middle of an enormous paradigm shift when it comes to assessing the pathogenicity of variants in the human genome. There is so much variability in the human genome that we have to be really, really sure before we call a change pathogenic. This is true for rare variants, de novo mutations and copy number variations. Genes affected by recessive or compound heterozygous variants, however, were implicitly felt to be exempt from this change of thinking. There is a growing number of publications that each describe a novel recessive gene in a single patient. I would like to argue that we need to approach recessive epilepsy genes with the same level of scrutiny. And here are some examples why.

Watson and Cockayne. When Nobel Prize laureate James Watson was the first individual in the world to be sequenced with next-generation sequencing methods, he was found to have a deleterious recessive mutation in ERCC6, the gene for Cockayne syndrome, a severe neurodegenerative disorder. It was obvious that James Watson was not affected by Cockayne syndrome, raising some initial suspicion that some of the variants in the literature might be false positives. This suspicion has been confirmed by many more recent studies using population genomics, which find that a significant proportion of the population carries recessive or compound heterozygous mutations that would be expected to result in severe recessive disorders. One example is the Dutch GoNL project, which was recently published and discussed by Dan Koboldt on massgenomics.org. In summary, the mutation spectrum of recessive disorders is unknown and databases are of limited usefulness. Therefore, if we want to assess the role of inherited variants in genetic epilepsies, we had better be very, very careful.

The E2 consortium. The E2 consortium is an international collaboration between Epi4K, EPGP, EuroEPINOMICS and DESIRE, connecting the major consortia in the field of epilepsy genetics research. Our analysis will currently include ~400 sequenced trios. As previously discussed, E2 is in the process of putting together a major study on de novo mutations in epileptic encephalopathies in the largest cohort of patients ascertained so far. As the next step, we will assess the role of inherited variants in Infantile Spasms and Lennox-Gastaut Syndrome. Let me outline how we aim to use a three-step approach to combine a conservative statistical analysis of inherited variants with an assessment of the frequency of undiagnosed neurometabolic disorders.

The E2 three-step model. When analyzing inherited risk factors in our ~400 exome-sequenced trios with epileptic encephalopathies, we will ask three separate questions. First, we are interested in genes with associated rare variants. Secondly, we are interested in genes in which we find evidence for "overtransmission" from both the paternal and maternal side. Finally, we will screen the E2 cohort for treatable metabolic disorders, many of which may masquerade as IS/LGS.

Step 1 – rare variant TDT. In a first assessment, we will ask whether there are genes in which rare variants are overtransmitted from parents to the affected children. This analysis follows the basic idea of a transmission disequilibrium test (TDT) and is essentially a trio-based association study. If rare variants in a given gene are associated with IS/LGS, they will be “overtransmitted” from parents to the offspring. In principle, every variant carried by a heterozygous parent has an a priori chance of 50% to be transmitted to the offspring. If there is an association between the gene and the phenotype, the offspring will statistically “attract” the pathogenic variants. This test will be insufficiently powered for individual variants, but may reveal interesting findings if all rare variants in a gene are combined, i.e. regarded jointly. The basic concept of a TDT still puzzles me a little, but there is good evidence in the literature that this method works. The particular type of TDT used for the E2 trios also takes population frequencies into account, adding power to the analysis.
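To make the 50% idea a little more concrete, here is a minimal sketch of a gene-level, collapsed TDT in Python. It is an illustration only and not the E2 pipeline: the input format, the function names and the gene label are assumptions made up for this example, and the sketch ignores the population frequencies that the E2 version of the test takes into account. Each rare allele carried by a heterozygous parent is either transmitted or not, and the counts per gene are compared against the 50% expectation with an exact binomial test.

```python
from math import comb


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are at most as likely
    as the observed count k.
    """
    if n == 0:
        return 1.0
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    # Small relative tolerance so that ties are counted despite float rounding.
    return min(1.0, sum(q for q in probs if q <= observed * (1 + 1e-9)))


def gene_tdt(transmissions):
    """Collapse rare-variant transmissions per gene and test against 50%.

    transmissions: iterable of (gene, transmitted) pairs, one pair per rare
    allele carried by a heterozygous parent (hypothetical input format).
    """
    counts = {}
    for gene, transmitted in transmissions:
        t, u = counts.get(gene, (0, 0))
        counts[gene] = (t + 1, u) if transmitted else (t, u + 1)
    return {
        gene: {
            "transmitted": t,
            "untransmitted": u,
            "p_value": binomial_two_sided_p(t, t + u),
        }
        for gene, (t, u) in counts.items()
    }


# Toy data: "GENE_A" is a placeholder, not a real finding.
toy = [("GENE_A", True)] * 9 + [("GENE_A", False)]
result = gene_tdt(toy)
print(result["GENE_A"])
```

In the toy data, 9 of 10 rare alleles in the placeholder gene are transmitted, which is unlikely under the 50% null. In a real exome-wide analysis, the resulting p-values would additionally have to be corrected for the number of genes tested.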

Step 2 – recessive TDT. The TDT can be expanded to only look at rare variants that are transmitted from both sides – this is actually a very nice tweak to bridge the gap between the association of rare variants with IS/LGS and the assessment of recessive disorders. We’re not quite sure what we will find when we use this analysis and how powerful this method will be, but it is an interesting concept to confront the issue of genomic noise head on. To our knowledge, an analysis like this has not been done before, and it will be an interesting template for analyses in large autism and intellectual disability datasets as well.
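One way to picture the recessive version is the following Python sketch. It rests on assumptions that are mine and not necessarily those of the E2 analysis: only trios in which both parents are heterozygous for a single rare variant in the gene are counted, the two transmissions are treated as independent, and the chance that the child receives both rare alleles is therefore 0.5 × 0.5 = 25% under the null. The function names, the input format and the gene label are hypothetical.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """Probability of observing k or more successes out of n trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def recessive_tdt(trios, null_prob=0.25):
    """Test per gene whether children inherit rare alleles from both parents
    more often than the 25% expected by chance.

    trios: iterable of (gene, from_father, from_mother) tuples, one per trio
    in which both parents are heterozygous carriers of a rare variant in the
    gene (hypothetical input format).
    """
    counts = {}
    for gene, from_father, from_mother in trios:
        both, total = counts.get(gene, (0, 0))
        counts[gene] = (both + int(from_father and from_mother), total + 1)
    return {
        gene: {
            "biallelic": both,
            "informative_trios": total,
            # One-sided: we only care about an excess of biallelic children.
            "p_value": binomial_upper_tail(both, total, null_prob),
        }
        for gene, (both, total) in counts.items()
    }


# Toy data: "GENE_B" is a placeholder, not a real finding.
toy = [
    ("GENE_B", True, True),
    ("GENE_B", True, True),
    ("GENE_B", True, True),
    ("GENE_B", True, False),
]
print(recessive_tdt(toy)["GENE_B"])
```

The sketch also shows why power is a concern: trios in which both parents carry a rare variant in the same gene are scarce, so the number of informative trios per gene will be small.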

Step 3 – known neurometabolic disorders. Finally, we will take on the task of screening the E2 cohort for known neurometabolic disorders. This is actually easier said than done given that there is no consensus on which and how many neurometabolic disorders may masquerade as genetic IS/LGS. All neurometabolic disorders are extremely rare, and the a priori probability of identifying cases with hidden neurometabolic disorders is very low. Given the overall lack of consensus on a gene list, we decided to use a validated list of potentially treatable metabolic disorders that was recently published. This list comprises ~80 genetic disorders and covers many of the known neurometabolic disorders that we usually aim to exclude clinically in patients with new-onset epileptic encephalopathies. This assessment will give us a good estimate of the frequency of hidden, potentially treatable metabolic disorders that present as epileptic encephalopathies.
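In its simplest form, a screen like this boils down to looking for biallelic genotypes in genes from the list. The Python sketch below shows that logic under several assumptions of my own: the record format, the function name and the gene labels are hypothetical placeholders (the actual ~80-gene list is not reproduced here), and the variants are assumed to be already filtered for rarity and predicted effect. A patient-gene pair is flagged if there is a homozygous variant, or if heterozygous variants were inherited from both the father and the mother (compound heterozygosity).

```python
def flag_biallelic(variants, gene_list):
    """Return sorted (patient, gene) pairs with a biallelic genotype.

    variants: iterable of dicts with the keys "patient", "gene",
    "zygosity" ("hom" or "het") and "origin" ("father" or "mother");
    this record format is a hypothetical example.
    gene_list: set of gene names to screen.
    """
    flagged = set()
    het_origins = {}
    for v in variants:
        if v["gene"] not in gene_list:
            continue
        key = (v["patient"], v["gene"])
        if v["zygosity"] == "hom":
            flagged.add(key)
        else:
            het_origins.setdefault(key, set()).add(v["origin"])
    # Compound heterozygous: at least one variant from each parent.
    for key, parents in het_origins.items():
        if {"father", "mother"} <= parents:
            flagged.add(key)
    return sorted(flagged)


# Placeholder gene names and toy calls, for illustration only.
screen_genes = {"GENE_X", "GENE_Y"}
calls = [
    {"patient": "P1", "gene": "GENE_X", "zygosity": "het", "origin": "father"},
    {"patient": "P1", "gene": "GENE_X", "zygosity": "het", "origin": "mother"},
    {"patient": "P2", "gene": "GENE_Y", "zygosity": "hom", "origin": "father"},
    {"patient": "P3", "gene": "GENE_X", "zygosity": "het", "origin": "father"},
    {"patient": "P4", "gene": "GENE_Z", "zygosity": "hom", "origin": "mother"},
]
print(flag_biallelic(calls, screen_genes))
```

Given the Watson example above, a flagged pair is a candidate for clinical follow-up and not a diagnosis.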

This is what you need to know. The next step of the E2 analysis will address the role of inherited variants in epileptic encephalopathies in a comprehensive and novel way. By assessing various types of underlying models, we are in the unique position to answer this next, important question, pushing the analysis of trio exome data to the next level.

 

Ingo Helbig

Child Neurology Fellow and epilepsy genetics researcher at the Children’s Hospital of Philadelphia (CHOP), USA and Department of Neuropediatrics, Kiel, Germany
