De novo mutations in severe intellectual disability

Diagnostic exome sequencing. Severe intellectual disability (ID) is unexplained in the vast majority of patients and is thought to be genetic. Genetic research on intellectual disability has traditionally focused on the X chromosome, where more than 100 possibly causative genes for ID are located. But other, autosomal genes are also found in a large number of cases. A recent study in the New England Journal of Medicine now reports on trio exome sequencing in patients with unexplained severe intellectual disability. The authors identify causative de novo events in a large proportion of patients. Interestingly, more than half of their patients had epilepsy.

Double Impact

Second hits. Genomic disorders are genetic disorders due to recurrent microdeletions or microduplications, i.e. small losses or gains of genomic material that happen again and again due to existing breakpoints in the human genome. Intriguingly, additional large microdeletions or microduplications can be identified in some patients with genomic disorders. A recent study in the New England Journal of Medicine tries to explain why.

To do: read ENCODE papers

ENCODE will change the way we analyse genomes. The comparison of long non-coding RNA and transcription factor binding sites will require more CPU time. Anything else? I don’t know, I am only writing this because Ingo asked me to. It’ll take time to study the 30+ papers, sift through the data and discuss it with colleagues. Only then, something like that understanding we hear so much about can happen and I am sure it will in journal clubs around the globe in the next weeks. But smaller things might already be interesting.


The years of our fathers: paternal age and the rate of de novo mutations

Aging fathers. An increase in the risk of aneuploidies, i.e. chromosomal aberrations such as Trisomy 21, with maternal age is well established. Whether paternal age also increases the risk for disorders in the offspring had long been disputed. However, a connection between paternal age and autism has been found in recent years. Now a recent study in Nature finds a surprisingly strong correlation on the genetic level.

Will the relevant SNPs please stand up

The flood of variants. Every re-sequencing of a genome yields many more variants than can be validated with functional assays. Many strategies exist to select candidate variants. Hard filtering on fixed criteria might remove all variants, so efforts now focus on re-ranking the list of variants such that the most promising ones appear on top. A recent review in Nature Reviews Genetics wants to give users a hand with the bioinformatics tools available. As a bioinformatician, I find a number of important points missing.
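The difference between filtering and re-ranking can be made concrete with a small sketch. Everything in it is an assumption made for illustration: the variant names, the three annotation scores, the thresholds and the equal-weight scoring rule are hypothetical and are not taken from the review or from any particular tool.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    name: str
    allele_freq: float   # population allele frequency (hypothetical value)
    conservation: float  # 0..1, higher means more conserved (hypothetical)
    damaging: float      # 0..1, predicted functional impact (hypothetical)


def hard_filter(variants, max_af=0.001, min_cons=0.9, min_dmg=0.9):
    """Keep only variants that pass every threshold at once."""
    return [
        v for v in variants
        if v.allele_freq <= max_af
        and v.conservation >= min_cons
        and v.damaging >= min_dmg
    ]


def rank(variants):
    """Sort all variants by a combined score instead of discarding any."""
    def score(v):
        # Rarity is scaled so that a frequency of 5% or more scores 0.
        rarity = 1.0 - min(v.allele_freq / 0.05, 1.0)
        return (rarity + v.conservation + v.damaging) / 3.0
    return sorted(variants, key=score, reverse=True)


# Hypothetical example variants, each failing at least one hard threshold.
VARIANTS = [
    Variant("var_A", 0.002, 0.95, 0.85),
    Variant("var_B", 0.0005, 0.60, 0.95),
    Variant("var_C", 0.04, 0.30, 0.20),
]

if __name__ == "__main__":
    print("after hard filter:", [v.name for v in hard_filter(VARIANTS)])
    print("after ranking:   ", [v.name for v in rank(VARIANTS)])
```

In this toy setting the hard filter returns an empty list because every variant narrowly misses one threshold, while the ranked list still puts the two plausible candidates on top.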


Somatic mutations affecting the mTOR pathway in hemimegalencephaly

Mutations, but not germline. Many of the genetic alterations that we aim to investigate within the EuroEPINOMICS projects are so-called germline mutations. In the case of de novo events, these mutations have occurred in the germ cells themselves or in very early development. In the case of autosomal dominant or recessive inheritance, the mutations have been transmitted from parents. In either case, the mutation can be found in every cell of the body. Cancer research is mainly focussed on somatic mutations, which give rise to malignant transformation in already differentiated tissues. In fact, many of the techniques that we currently use in neurogenetics were developed to study somatic genetic aberrations. Array comparative genomic hybridization, for example, had initially been established for these purposes before the focus expanded to germline microdeletions and microduplications. While the role of somatic mutations in cancer is well established, the role that somatic rather than germline genetic alterations play in other disorders is mainly speculative. Some initial evidence for somatic point mutations has been found in Proteus syndrome, a rare overgrowth syndrome, in which activating somatic mutations in AKT1 have recently been identified. A recent paper by Lee and colleagues now identifies mutations in several genes in the mTOR pathway in patients with hemimegalencephaly, a severe form of brain malformation.

One fish, two fish, red fish, blue fish – KCTD13 and neurogenetic studies in zebrafish

Microdeletions in seizure disorders. In a recent paper in Nature, Golzio and colleagues identified KCTD13 as the main driver for the neurodevelopmental phenotype of the 16p11.2 microdeletion. Small losses of chromosomal material as found in microdeletions usually affect several neighbouring genes. Many deletions are due to the particular duplication architecture of the human genome and are canonical, i.e. they always have the same size and include the same genes. The same duplication architecture also makes these variants relatively common, and the full impact of microdeletion-associated genetic morbidity has startled the neurogenetics community. The past five years have led to the identification of several epilepsy-related microdeletions including variants at 15q13.3, 16p13.11 and 15q11.2. There are further microdeletions that are usually found in patients with autism or intellectual disability and to a lesser extent in patients with epilepsy. The 16p11.2 microdeletion, the first microdeletion to be identified through a large-scale association study, is one of these variants.

From deletion to causative genes. For many microdeletions, the statistical evidence for the association with a particular phenotype is beyond reasonable doubt given that several thousand samples can be included nowadays. The identification of the underlying causative gene, however, is extremely difficult. It is technically challenging and time-consuming to investigate all included genes functionally through conventional model systems. The functions of many genes included in microdeletions are not related to ion channels, the best known pathological substrate in epilepsies, which hampers testing their effects through established electrophysiological techniques. Finally, microdeletions only lead to hemizygosity, i.e. the second copy of a gene should still be expressed, albeit at a lower level, requiring model systems that look for a quantitative rather than a qualitative change. The bottom line is that epilepsy researchers are stuck without suitable model systems that would allow for medium-throughput screening of the genes in these deletions. This is where Danio rerio comes into play.

The zebrafish as a model for neurodevelopmental disorders. The zebrafish (Danio rerio) is a good model system for genetic and developmental research. The technologies for genetic manipulation are highly advanced. In addition, embryos are transparent and develop externally. Furthermore, a zebrafish develops quickly and produces a large number of offspring. Christiane Nüsslein-Volhard, who received the Nobel Prize in Physiology or Medicine in 1995 for her work on the genetic control of early embryonic development in Drosophila, later helped establish the zebrafish as a model system through large-scale genetic screens.

Screening of the candidate genes of the 16p11.2 microdeletion. Golzio and coworkers focussed on a peculiar aspect of the 16p11.2 microdeletion as an outcome parameter for their genetic screening: macrocephaly, i.e. an enlarged head circumference. In contrast, patients with the corresponding 16p11.2 microduplication often show microcephaly, i.e. a reduced head circumference. Golzio and colleagues devised a system to measure head circumference in zebrafish embryos and then overexpressed the 29 genes contained in the 16p11.2 microdeletion in the developing embryo. Strikingly, only overexpression of KCTD13 resulted in microcephaly. Macrocephaly was seen when KCTD13 was knocked down with a morpholino. This demonstrated that up- or downregulation of KCTD13 affects head size. The authors went on to show that these differences in head size are driven by differences in neuronal proliferation. KCTD13 is highly expressed in the human forebrain and recent studies have suggested a role for excessive neurons in the frontal lobe in autism.

Figure 1. Study design by Golzio and coworkers to identify KCTD13 as the main gene within the 16p11.2 microdeletion responsible for micro- and macrocephaly. Neuronal proliferation or apoptosis underlies this phenomenon.

Application to epilepsy research. The authors combine a clever screening strategy with a convincing follow-up study, highlighting the potential of zebrafish studies in neurogenetics. However, head circumference is not identical to autism and only represents a surrogate parameter. Therefore, even though the authors emphasize the role of head circumference as an essential part of the 16p11.2 phenotypes, it only represents a minor aspect of them. Nevertheless, the authors demonstrate that Danio rerio is a good model system for medium-throughput screening strategies, and epilepsy models in zebrafish do exist, suggesting that this study design might help decipher the plethora of candidate genes arising from the genetic studies in EuroEPINOMICS.

No use in studying gene-gene and gene-environment effects in complex diseases?

Genome-wide association studies (GWAS) have improved our insight into the genetics of complex diseases but have fallen short of initial expectations, leaving the majority of the heritability to be explained. Interactions of genes with the environment and with each other receive a fair share of the blame for the lack of progress despite the widespread efforts. The large number of possible interactions, however, currently still limits progress in this field. A dedicated and growing group of computer scientists and geneticists now study gene-gene effects in the hope of shedding light on complex diseases. Initial results were encouraging, even in the field of epilepsy genetics.

Now, a group of Harvard-based biostatisticians has presented simulations for breast cancer, type 2 diabetes and rheumatoid arthritis that include gene-gene and gene-environment effects. Their interpretation is bleak: little predictive power can be gained by including the additional dependencies, which means that all the CPU time currently consumed for their analysis is only warming the planet and the hearts of computer scientists.
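To get a feeling for what "little predictive power gained" can look like, here is a toy calculation. It is not the simulation from the study and makes no claim about its results: the two loci, the allele frequency of 0.3, and all log-odds parameters (`B0`, `B1`, `B2`, `B12`) are hypothetical values chosen for illustration. The sketch computes the exact area under the ROC curve (AUC) for a risk score that uses main effects only and for one that also includes the gene-gene interaction term.

```python
from itertools import product
from math import exp

# Genotype frequencies under Hardy-Weinberg for an allele frequency of 0.3
# (hypothetical; the same frequencies are assumed for both loci).
GENO_FREQ = {0: 0.49, 1: 0.42, 2: 0.09}

# Hypothetical log-odds: baseline, two main effects, one interaction.
B0, B1, B2, B12 = -2.0, 0.2, 0.3, 0.3


def logistic(x):
    return 1.0 / (1.0 + exp(-x))


def genotype_table():
    """One row per two-locus genotype: (weight, risk, additive, full)."""
    rows = []
    for g1, g2 in product(GENO_FREQ, repeat=2):
        weight = GENO_FREQ[g1] * GENO_FREQ[g2]
        additive = B1 * g1 + B2 * g2
        full = additive + B12 * g1 * g2
        rows.append((weight, logistic(B0 + full), additive, full))
    return rows


def exact_auc(rows, score_index):
    """Probability that a random case outscores a random control (ties 0.5)."""
    prevalence = sum(w * r for w, r, *_ in rows)
    total = 0.0
    for case in rows:
        for control in rows:
            pair = case[0] * case[1] * control[0] * (1.0 - control[1])
            if case[score_index] > control[score_index]:
                total += pair
            elif case[score_index] == control[score_index]:
                total += 0.5 * pair
    return total / (prevalence * (1.0 - prevalence))


if __name__ == "__main__":
    table = genotype_table()
    auc_additive = exact_auc(table, 2)
    auc_full = exact_auc(table, 3)
    print(f"AUC, main effects only: {auc_additive:.4f}")
    print(f"AUC, with interaction:  {auc_full:.4f}")
    print(f"gain:                   {auc_full - auc_additive:.4f}")
```

With these made-up parameters, the interaction only changes the ordering of two of the nine genotype classes, so the gain in AUC is very small even though the interaction term itself is as large as the main effects. Other parameter choices can behave differently, which is exactly the kind of sensitivity such a study has to explore.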

Diabetes in the US

The large number of cases of diabetes and many other widespread complex diseases is not easily explained. And the Aschard study suggests that it will remain so for the immediate future despite the progress in sequencing technology.

Negative predictions from experts for their own domain usually provoke a backlash. The study could probably be attacked on the grounds that the authors selected a large number of parameters, some probably from little more than thin air. But the geneticists on Twitter remained silent. Is this acceptance already? Maybe the critics still lie exhausted from attacking Vogelstein’s negative predictions from a couple of months ago.

If the statistical model and parameter choices find widespread acceptance, it would mean that it is virtually impossible to explain many complex diseases from genetics alone to a sufficient degree. As individual studies of the interactions of two SNPs are difficult enough, many cases of complex diseases will remain unexplained. Despite all the efforts, it would be almost as dark as before we had high-throughput sequencing facilities.

FAME – when phenotypes cross over but chromosomes don’t

Crompton and colleagues recently published the clinical and genetic description of a large family with Familial Adult Myoclonic Epilepsy (FAME).  This phenotype is particularly interesting since it provides some insight into how neurologists conceptualize twitches and jerks.  It is also a good example that large families do not necessarily result in a narrow linkage region, particularly when centromeric regions are involved.

What is myoclonus? Despite usually being mentioned in the context of epilepsy, myoclonus is something most people are inherently familiar with. Most of us “twitch” when we fall asleep and sometimes experience this twitch as part of a dream. These episodes are entirely normal and are called hypnic jerks, but they give people a good idea of what a sudden, brief, shocklike, involuntary movement caused by muscular contraction or inhibition would feel like. Myoclonus in the setting of epilepsy is usually mentioned as part of a Juvenile Myoclonic Epilepsy (JME) or Progressive Myoclonus Epilepsy (PME). Please note that both epilepsies use different endings to describe the twitch (“-us” vs. “–ic”). This is mainly convention. Basically, myoclonus is a brief shock-like twitch, which can affect almost every part of the body and can be due to dysfunctions in various regions of the Central Nervous System.

The neuroanatomy of twitching.  A motor command from the cerebral cortex has to pass through several steps prior to execution.  For example, the simple command of tapping a finger on the table surface is prepared by the cortex through several loops before being sent down your spine.  Accordingly, myoclonus can arise from different parts in the brain.  (1) The cortical myoclonus is due to a purely cortical source and can be seen in many forms of symptomatic myoclonus.  (2) The cortico-subcortical myoclonus is due to feedback from the cortex to other brain areas. This is the myoclonus we see in patients with JME.  Both variants may be seen on EEG since the cortex is involved.  (3) The subcortical-supraspinal myoclonus is generated in the brain stem or below and is responsible for phenomena such as hyperekplexia or startle disease.   Some forms of hyperekplexia, literally “exaggerated surprise”, are due to mutations in genes involved in glycinergic transmission and can be found in some isolated communities such as the Jumping Frenchmen of Maine.  (4) Finally, there is also spinal and peripheral myoclonus.

FAME – epilepsy or movement disorder? Familial Adult Myoclonic Epilepsy (FAME) is an enigmatic familial disorder with the triad of myoclonus, tremor and seizures. Several families have been described and two loci for FAME, on 8q23.3-8q24.11 and 2p11.1-q12.2, have been established. The underlying genes are still unknown. Crompton and colleagues now describe a large six-generation family with FAME in Australia/New Zealand. The familial disease usually starts with tremor in early adulthood in the affected family members, even though a wide range of age of onset is observed. Interestingly, only a quarter of all affected family members had seizures, which is in contrast to previous studies. Therefore, FAME may actually be better characterized as a movement disorder with concomitant seizures rather than a familial epilepsy syndrome. The authors also point out the difficulties in distinguishing FAME from the much more common essential tremor (ET). In particular, the well-described response to β-blockers seen in patients with ET can also be observed in some family members.

Figure 1. The candidate gene landscape of the chr2 FAME region. For every gene, the number of PubMed hits for each of the listed search terms was retrieved in an automated fashion. As usual in large linkage intervals, only a few genes are known in the context of neurological disorders, while most genes are unknown.

The genetics of FAME.  Crossovers during meiosis usually lead to a progressive narrowing of the linkage interval in familial disorders.  However, the lack of crossover events leads to very large linkage intervals even in very extended families.  The family described by Crompton et al. links to the pericentromeric region of chromosome 2.  Pericentromeric regions usually have a low frequency of crossover events, and this phenomenon has also delayed the identification of other familial epilepsies such as Benign Familial Infantile Seizures with mutations in PRRT2.  The linkage region contains almost 100 genes and Figure 1 shows the “candidate gene landscape” in this region.  While some genes clearly classify as top candidate genes, the majority of the genes in this region are unknown in the context of epilepsy. Therefore, identification of the FAME gene will be exciting and provide us with novel insight on how genetic alterations may produce combined neurological phenotypes.

Be literate when the exome goes clinical

Exomes on Twitter. Two different trains of thought eventually prompted me to write this post. First, a report of a father identifying the mutation responsible for his son’s disease pretty much dominated the exome-related twittersphere. In Hunting down my son’s killer, Matt Might describes his family’s journey that finally led to the identification of the gene coding for N-Glycanase 1 as the cause of his son’s disease, West Syndrome with associated features such as liver problems. The exome sequencing that finally led to the discovery was part of a larger program on identifying the genetic basis of unknown, putatively genetic disorders reported in a paper by Anna Need and colleagues, which is available through open access. This paper is an interesting proof-of-principle study that exome sequencing is ready for prime time. Need and colleagues suggest exome sequencing can find causal mutations in up to 50% of patients. By the way, a gene that also turned up again was SCN2A in a patient with severe intellectual disability, developmental delay, infantile spasms, hypotonia and minor dysmorphisms. This represents a novel SCN2A-related phenotype, expanding the spectrum to severe epileptic encephalopathies.

The exome consult. My second experience last week was my first “exome consult”. A colleague asked me to look at a gene list of a patient to see whether any of the genes identified (there were 300+ genes) might be related to the patient’s epilepsy phenotype. Since I wasn’t sure how best to handle this, I ran an automated PubMed search combining each gene with 20 search terms, using a small R script I wrote. Nothing really convincing came up except the realisation that this is an issue we will increasingly be faced with in the future: working our way through exome datasets after the first “flush” of data analysis did not reveal convincing results. Two terms that came to my mind were bioinformatic literacy as something that we need to improve and Program or Be Programmed, a book by Douglas Rushkoff on “Ten Commands for a Digital Age”. In his book, he basically points out that in the future, understanding rather than simply using IT will be crucial.
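For readers who want to try something similar, here is a minimal sketch of such a gene-by-term PubMed lookup. It is written in Python rather than R and is not the original script. It assumes the public NCBI E-utilities `esearch` endpoint with JSON output; the helper names (`build_queries`, `esearch_url`, `pubmed_hits`), the restriction to the title/abstract field and the example gene and term lists are illustrative choices.

```python
import json
from itertools import product
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities search endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_queries(genes, terms):
    """Map every (gene, term) pair to a PubMed query string."""
    return {
        (gene, term): f'{gene}[Title/Abstract] AND "{term}"[Title/Abstract]'
        for gene, term in product(genes, terms)
    }


def esearch_url(query):
    """Build the request URL; retmax=0 asks for the hit count only."""
    params = {"db": "pubmed", "term": query, "retmode": "json", "retmax": 0}
    return ESEARCH + "?" + urlencode(params)


def pubmed_hits(query):
    """Return the number of PubMed records matching the query (needs network)."""
    with urlopen(esearch_url(query)) as response:
        return int(json.load(response)["esearchresult"]["count"])


if __name__ == "__main__":
    # Hypothetical short lists; a real consult would use 300+ genes.
    queries = build_queries(["SCN2A", "PRRT2"], ["epilepsy", "seizure"])
    for (gene, term), query in queries.items():
        print(gene, term, esearch_url(query))
```

Calling `pubmed_hits` for every pair fills the gene-by-term table. For a list of 300+ genes and 20 terms this means several thousand requests, so the loop should pause between calls to stay within NCBI's usage limits.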

The cost of interpretation is rising. The Genome Center in Nijmegen suggests on its homepage that by the year 2020, whole-genome sequencing will be a standard tool in medical research. What this webpage does not say is that by 2020, 95% of the effort will go not into the technical aspects of data generation, but into data interpretation. For biotechnology, interpretation will be the largest market segment.

By 2020, probably more than 10 million genomes will have been sequenced. Data interpretation rather than data generation will represent the most pressing issue.

So, what about epilepsy? “50% of cases to be identified” sounds good for any grant proposal that I would write, but this might be a clear overestimate. Need and colleagues used a highly selected patient population and even in the variants they identified, causality is sometimes difficult to assess. We are maybe much further away from clinical exome sequencing in the epilepsies than we would like to admit. The only reference points we have for seizure disorders to date are the large datasets for patients with autism and intellectual disability. While some genes with overlapping phenotypes can be identified, we would virtually be drowning in exome data without being able to make sense of it.

10,000 exomes now. I would like to predict that after the low-hanging fruit among monogenic disorders has been identified, 10,000 or more “epilepsy exomes” will have to be collected before significant progress is made. It is, therefore, crucial not to be tempted by wishful thinking that particular epilepsy subtypes necessarily have to be monogenic, as in the case of epileptic encephalopathies or other severe epilepsies. Much of the genetic architecture of the epilepsies might be more complex than anticipated, requiring larger cohorts and unanticipated perseverance.