CNV. There are different forms of genetic variation, and historically, our ability to query the entire exome or genome is a relatively recent development. The first type of genetic variation that could be assessed in large epilepsy cohorts was copy number variation (CNV), small gains or losses of chromosomal material. In a recent study, the entire Epi25 cohort was analyzed for CNVs, giving a long-needed update on the role of structural genomic variation in various forms of epilepsy and highlighting that the overall landscape of CNVs in the epilepsies is well understood and delineated. With up to 3% of individuals with epilepsy carrying one of the recurrent CNVs, this type of genomic variation remains a rare but important source of genetic morbidity in the epilepsies.
History of the CNVs. Before exomes, there were CNVs. I started my scientific career in epilepsy genetics working on structural genomic variants in the epilepsies that we initially investigated in the EPICURE cohort. Initially, this cohort had been recruited for a genome-wide association study (GWAS). However, the first results emerging from this cohort were not the association results, but the analysis of structural genomic variants. The Single Nucleotide Polymorphism (SNP) arrays served a dual purpose. While genome-wide association studies looked at the identity of markers, novel algorithms to look at the quantity or dosage of information were emerging. In this blog post, I am linking out to some of my older posts on CNVs that have useful illustrations to explain what we know about CNVs.
The CNV revolution. It is one of many twists and turns in the history of epilepsy genetics that CNVs emerged from a study that was initially not meant to detect them. However, since our initial work, structural genomic variants including microdeletions at 15q13.3, 16p13.11, and 15q11.2 were considered potential risk factors in the epilepsies, particularly the genetic generalized epilepsies. In some of my presentations where I cover CNVs, I frequently refer to the CNV revolution. Even though CNVs only explain a small proportion of the overall genetic risk in the epilepsies, it was the study of CNVs that made large studies in the epilepsies possible. The discovery of CNVs occurred at a critical juncture in epilepsy genetics where association studies had been performed frequently in small cohorts and were often plagued by non-reproducible findings. CNVs changed all of this.
Risk factors. Once discovered, epilepsy-related CNVs revealed a property that made them extremely difficult to interpret in clinical practice. These genomic alterations were largely risk factors, not “monogenic” causes (quotation marks, as a CNV includes multiple genes). This means that many of the identified CNVs behaved differently than we would expect of disease-causing variants in SCN1A or STXBP1. While some CNVs such as the 15q13.3 microdeletion or the 16p11.2 microdeletion were typically de novo and often fully explained the individual’s epilepsy, many CNVs ran in families and were also found in unaffected individuals. Many were truly associated with epilepsy, as was the case for the 16p13.11 microdeletion. Other microdeletions were more difficult to interpret.
Thresholds. The 15q11.2 microdeletion is a CNV that has been difficult to prove or disprove. As of 2020, the 15q11.2 microdeletion remains a mystery. It may represent a mild risk factor, but it has the unusual property of continuing to “hover” just below the significance threshold. The association signal for this variant was never truly there, nor did it completely go away. Clinically, only a subset of the CNVs has emerged as truly relevant in patient care, such as the 15q13.3, 16p11.2, 22q11.2 (22q deletion syndrome), and 1q21.1 deletions. Many other CNVs remain association signals and risk factors that do not fully explain the patient’s disease.
Default breakpoints. The recurrent CNVs mentioned above are relatively frequent due to an unusual feature in our genome – segmental duplications. Our genome contains many brief segments that are almost identical. During DNA replication, the replication machinery may mistake one segment for another, and the intervening sequence is deleted or duplicated. As these default breakpoints have precise locations in the genome, the recurrent microdeletions arising from this genomic feature are always the same size. Therefore, these CNVs are typically referred to only by their chromosomal location, such as 16p13.11.
CNV study. In a recent publication in Brain, Niestroj and collaborators assessed the Epi25 cohort for disease-causing CNVs, including 10,712 individuals with epilepsy of European descent and 6,746 ancestry-matched controls. This dataset is the largest dataset for CNV analysis in the epilepsies to date and the one analyzed in the most stringent way. All individuals were genotyped using SNP arrays and were strictly matched. The scope of the study alone makes this publication the definitive study on CNVs in the epilepsies. In line with our Epi25 criteria, the phenotypes included were the developmental and epileptic encephalopathies (DEE), genetic generalized epilepsy (GGE), non-acquired focal epilepsy (NAFE), and lesional focal epilepsy (LFE).
CNV burden. What did the study find? First, the DEEs and GGEs stand out as the diseases with the highest CNV burden. Both the DEEs and GGEs had an approximately four-fold higher frequency of deletions larger than 2 Mb compared to controls. For the DEEs, this enrichment was expected given the overall disease severity and comorbidity with other neurodevelopmental disorders. For the GGEs, however, this finding is striking and confirms our prior findings that the GGEs have a higher frequency of large deletions independent of any neurodevelopmental comorbidity. The strongest risk factor for GGE, with an odds ratio of 36, is the 15q13.3 microdeletion, which is a much stronger risk factor for GGE than for any other epilepsy.
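For readers who want to see what an odds ratio like this means in practice, here is a minimal sketch of how such a number can be computed from carrier counts in cases and controls. The function name `odds_ratio` and all case-side counts below are hypothetical and chosen only for illustration; they are not the numbers from the study. The confidence interval uses the standard log (Woolf) method, which may differ from the statistical approach used in the publication.

```python
import math


def odds_ratio(case_carriers, case_total, control_carriers, control_total):
    """Odds ratio with an approximate 95% confidence interval (log method).

    If any cell of the 2x2 table is zero, 0.5 is added to every cell
    (Haldane-Anscombe correction) so the ratio stays defined.
    """
    a = case_carriers                      # cases with the CNV
    b = case_total - case_carriers         # cases without the CNV
    c = control_carriers                   # controls with the CNV
    d = control_total - control_carriers   # controls without the CNV
    if 0 in (a, b, c, d):
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))

    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - 1.96 * se_log)
    upper = math.exp(math.log(estimate) + 1.96 * se_log)
    return estimate, lower, upper


# Hypothetical example: 20 carriers among 2,000 cases versus
# 2 carriers among 6,746 controls (illustrative counts only).
estimate, lower, upper = odds_ratio(20, 2000, 2, 6746)
print(f"OR = {estimate:.1f} (95% CI {lower:.1f} to {upper:.1f})")
```

With rare variants, the number of carriers among controls is small, so the confidence interval around a large odds ratio is typically wide. This is one reason why very large control cohorts matter for CNV studies.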
The CNV landscape. The recurrent CNVs are also referred to as CNV hotspots. These hotspots accounted for a significant proportion of deletions in the CNV group, but there were many other deletions and duplications in patients with epilepsy. When a group of rare deletions is analyzed, it is often difficult to assess whether a single, newly identified deletion is causative. However, when looking across a large number, patterns emerge. For example, the DEEs and GGEs showed a clear enrichment of genes intolerant to variation (pLI > 0.95), i.e. genes that we would consider candidates when de novo variants were found in these genes. There was also enrichment of known epilepsy-associated genes.
What you need to know. The publication by Niestroj and collaborators is the definitive study on CNVs in the epilepsies, replicating prior findings at a much larger scale. The CNV landscape in the epilepsies is defined by recurrent hotspot deletions, with the 15q13.3 microdeletion emerging as a strong risk factor for GGE. Beyond the known recurrent hotspots, there are no other regions across the genome where recurrent, large CNVs are hidden – the CNV landscape of the epilepsies has been explored. Up to 3% of individuals with epilepsy carry CNVs, mainly the known recurrent variants.