Exome failures. Trio exome sequencing has huge potential to uncover the genetic basis of neurodevelopmental disorders. However, the results are negative for the majority of patients. In a recent study published in Nature, genome sequencing was applied to exome-negative patients with intellectual disability, identifying mutations in coding regions that had previously been missed. But are the authors correct in stating that they can explain more than 60% of cases in an unselected cohort?
Trio exome. The exome is incomplete. What is usually referred to as exome sequencing does not cover all coding regions of all genes, but only the vast majority of them. Often, the first exon is not completely covered, and some genes are not covered at all. Exome sequencing, as we indicated in an earlier post, is a technology, not a claim that all genes are covered. It is usually assumed that about 10% of the coding region is not sufficiently represented in exome data.
Genomes. One way to overcome this limitation is to use a more comprehensive technology. Genome sequencing is getting cheaper and more widely available and could fill the gap that exomes leave behind. If causative mutations are distributed evenly across the coding sequence, we would expect roughly 10% of additional cases to be explained by genome sequencing. But no study has ever demonstrated this.
The study. Gilissen and collaborators performed genome sequencing on 50 patient-parent trios that were previously included in the study by de Ligt and collaborators. In the earlier study, patients had already undergone trio exome sequencing. Copy number variants (CNVs) and mutations in various candidate genes had also been excluded previously. Looking at the exonic data derived from the genomes, what did they find? They could identify a likely causative mutation in 20 patients, raising the overall discovery rate in their patient cohort to 63%.
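The arithmetic behind such an overall rate can be sketched in a few lines. This is a minimal sketch, assuming that the 63% is a cumulative estimate in which each test is applied only to patients left undiagnosed by the previous one; the function names are hypothetical and only the figures 20 of 50 and 63% come from the text above.

```python
def cumulative_yield(step_yields):
    """Fraction of a cohort diagnosed after applying tests in sequence.

    Each entry is the yield of one test among the patients who are still
    undiagnosed after all earlier tests (an assumption of this sketch).
    """
    undiagnosed = 1.0
    for y in step_yields:
        if not 0.0 <= y <= 1.0:
            raise ValueError("each yield must lie between 0 and 1")
        undiagnosed *= 1.0 - y
    return 1.0 - undiagnosed


def implied_prior_yield(overall, last_step):
    """Combined yield of all earlier tests, given the overall yield and
    the yield of the last test among previously negative patients."""
    if not 0.0 <= last_step < 1.0:
        raise ValueError("last_step must lie in [0, 1)")
    return 1.0 - (1.0 - overall) / (1.0 - last_step)


# Figures from the text: 20 of 50 exome-negative patients, 63% overall.
genome_step = 20 / 50
prior = implied_prior_yield(0.63, genome_step)

print(f"implied yield of earlier testing: {prior:.3f}")
print(f"cumulative yield: {cumulative_yield([prior, genome_step]):.2f}")
```

Under this assumption, the earlier testing would have to explain a little over a third of an unselected cohort for the genome step to bring the total to 63%. The overall figure is therefore only as good as the yields that are chained together, which matters when the cohort is enriched for syndromic patients.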
Genes. Some of the genes identified are known to the epilepsy community. Gilissen and collaborators found de novo mutations in SCN2A, KCNA1, SPTAN1 and ALG13. They also identified a de novo mutation in WDR45, a gene for neurodegeneration with brain iron accumulation (NBIA) that we had introduced in a previous post. The de novo mutation in SPTAN1, which codes for spectrin, alpha, non-erythrocytic 1, is interesting, as mutations in this gene had so far only been found in the context of epileptic encephalopathies. ALG13 is a gene found in two patients in the Epi4K study. All three ALG13 mutations, the two from Epi4K and the one reported here, occur at the same site. The patient in the study of Gilissen and collaborators had an additional de novo mutation in RAI1, the gene mutated in some patients with Smith-Magenis syndrome.
Cohorts. When I looked at the phenotypes of the patients with explanatory mutations in the Supplement, I was slightly surprised. Many patients had distinct syndromes that have intellectual disability as an associated feature. For example, the patient with the WDR45 de novo mutation had a phenotype compatible with SENDA (Static Encephalopathy of childhood with Neurodegeneration in Adulthood). The patient with the de novo mutation in KCNA1 also had epilepsy and myokymia, which have previously been reported as part of the phenotype of this gene. However, there were also some surprises. A patient with a de novo mutation in MECP2 had previously tested negative for this gene. Two patients with mutations in SMC1A, a gene for Cornelia de Lange syndrome, had much more severe phenotypes than usually seen in this syndrome.
Lessons. In summary, the authors expanded the range of genetic diagnoses in their patient cohort by upgrading exomes to genomes. But can we estimate how well this technology actually works and whether it achieves the 60% diagnostic rate suggested by the authors? There are a few important caveats. Most importantly, the cohort studied by Gilissen and collaborators contained many patients with syndromic features. This might lead to a much higher diagnostic rate than in unselected cohorts of patients with severe intellectual disability. However, it is undeniable that there are genetic causes hidden in the ~10% of the coding regions that are missed by exome sequencing.