The novel gene dilemma

N-of-1. Whole exome sequencing has driven many of the recent gene discoveries in the epilepsy field. However, in contrast to established genes or emerging genes that are found in several patients, a significant proportion of patients carry de novo mutations in novel genes. In many cases, these novel genes look very suspicious. One aspect of a recent publication in Genetics in Medicine was to assess how these suspicious candidates convert to established genes over time.

Gene discovery. This figure shows the time between the diagnostic exome report and an additional independent report for eight genes that were initially reported out as possibly or likely positive novel genes. Three of these genes (COQ4, DNM1, PURA) have been independently confirmed as disease genes through peer-reviewed studies. Alterations in the other genes were found in at least one additional patient. These data should give us some estimate of how often and how quickly novel candidate genes convert into disease genes. In summary, we postulate a 20% annual conversion rate for good candidate genes.

Disclaimer. This publication was conceptualized and written by my wife Katie and is her first major publication in her role as an exome genetic counselor and analyst at Ambry Genetics, a diagnostic laboratory that performs exome sequencing as one of its services. During the last 18 months, Katie, as part of the Ambry exome team, has crunched dozens and dozens of diagnostic epilepsy exomes. One feature that we have capitalized on in this publication is the fact that the diagnostic exome report also includes an assessment of novel genes, going back as far as 2012. As most diagnostic reports do not make it into the scientific literature, this allowed us to see how genes reported out as possible novel candidates became disease genes during the last 3-4 years. In summary, our data suggest that these novel candidates have a good chance of becoming established disease genes within the next 12 months, or at least of being identified in a second patient.

Overall framework of the study. The novel gene assessment was only one aspect of our overall study. Basically, Katie’s study assessed the usefulness of diagnostic exome sequencing in patients with epilepsy compared to other disorders. We were able to include over 300 patients with epilepsy in this study, and most patients received trio exome sequencing. The study showed a 40% rate of positive findings in epilepsy patients, a 10% excess compared to patients without epilepsy. This rate is comparable to what has been published in the field, but the finding that epilepsy is actually the phenotype with the highest rate of positive exome findings is a novel aspect. Patients with epileptic encephalopathies and onset in the first year of life had the highest rate of positive findings. Another interesting finding relates to the inheritance models in exome-positive cases. While the majority of positive findings were due to de novo mutations (~75%), recessive inheritance was found in 20% of patients and X-linked inheritance in ~5% of patients. While there may be some bias in these data given the referral-based nature of this cohort, this helps put our expectation of positive exome results into perspective. We have included the full list of all identified mutations in the Supplement of our publication.

Most common genes. It is also interesting to review the genes that were most commonly found in our study. In descending order, these genes were KCNQ2 and MECP2, which were found in four patients each, and FOXG1, IQSEC2, KMT2A, and STXBP1, which were found in three patients each. You may notice the absence of SCN1A in this list, which may indicate that, on a clinical level, most patients with Dravet Syndrome are identified before proceeding to exome analysis. In most cases, these patients were probably picked up by gene panel analysis. The same probably applies to SCN2A and CDKL5. This distorted gene frequency reflects the fact that exome sequencing is usually not a first-line test. KMT2A is a gene that I was not familiar with previously. De novo truncating mutations in KMT2A cause Wiedemann-Steiner Syndrome, which is characterized by intellectual disability, short stature, and relatively mild dysmorphic features that typically become more prominent over time. As many children with Wiedemann-Steiner Syndrome also have early-onset epilepsy, this gene was found in multiple patients in our study.

Novel genes. In addition to established genes, almost 10% of patients with epilepsy had suspicious-looking mutations in genes that were novel at the time. Using a framework to assess gene function, pathways, expression data, and known phenotypes in animal models, these candidate genes were reported as either possibly positive (7 genes) or likely positive (14 genes). Three genes that were novel at the time (DNM1, PURA, and COQ4) were confirmed as disease genes in the following year, entirely independent of the initial diagnostic report. For five additional genes (SNAP25, SV2A, SON, CLTC, and OGT), further patients were described in the following 12 months. In most cases, these additional patients came from the Deciphering Developmental Disorders (DDD) study, which underscores the need for data sharing in the field.

What you need to know. Novel gene discovery through diagnostic exome sequencing is an ongoing challenge. With our study, we tried to document how candidate genes become disease genes over time, in order to establish a reference framework for what we can expect in the future. The numbers of novel genes in our study were small, but we would like to suggest the following hypothesis: if a novel candidate gene is identified through exome sequencing and this gene looks suspicious based on existing data, there is almost a 50% chance that this gene will be found in other patients in the next 12 months and a 20% chance that this gene will become an established disease gene during this time.
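To give a feel for how much uncertainty these small numbers carry, here is a rough back-of-the-envelope sketch in Python. It is not the analysis from the publication. It assumes that the 21 candidate genes mentioned above (7 possibly positive plus 14 likely positive) can serve as a common denominator and that all of them had comparable follow-up, which the post does not state. The function name `wilson_interval` and the constants are made up for illustration, and the Wilson score interval is simply one common choice for proportions from small samples.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    p = successes / total
    denom = 1 + z ** 2 / total
    center = (p + z ** 2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return center - half_width, center + half_width


# Counts quoted in the post. Treating all 21 candidates as one denominator
# with equal follow-up time is an assumption made for this sketch.
CANDIDATE_GENES = 7 + 14   # possibly positive + likely positive
CONFIRMED = 3              # DNM1, PURA, COQ4
SEEN_AGAIN = 3 + 5         # confirmed genes plus genes with further patients

for label, count in [("confirmed as disease gene", CONFIRMED),
                     ("found in additional patients", SEEN_AGAIN)]:
    low, high = wilson_interval(count, CANDIDATE_GENES)
    print(f"{label}: {count}/{CANDIDATE_GENES} = {count / CANDIDATE_GENES:.0%} "
          f"(approx. 95% interval {low:.0%} to {high:.0%})")
```

Under these assumptions the crude proportions come out somewhat below the rounded 20% and 50% figures, but both intervals are wide and include them, which is consistent with the caveat that the numbers of novel genes were small.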

 

Ingo Helbig is a child neurologist and epilepsy genetics researcher working at the Children’s Hospital of Philadelphia (CHOP), USA. He also leads the epilepsy genetics group at the University of Kiel, Germany.
