SCN2A – a neurodevelopmental disorder digitized through 10,860 phenotypic annotations

HPO. SCN2A-related disorders are among the most common causes of neurodevelopmental disorders and developmental and epileptic encephalopathies (DEE). However, while a genetic diagnosis is easily made through high-throughput genetic testing, SCN2A-related disorders have such a broad phenotypic range that understanding the full scale of the clinical features has traditionally been difficult. In our recent study, we used a harmonized framework for phenotypes based on the Human Phenotype Ontology (HPO) to systematically curate phenotypic annotations in all individuals reported in the literature and followed at our center, a total of 413 unrelated individuals. Mapping phenotypic data onto 10,860 HPO terms spanning 562 unique concepts and applying some of the computational tools we have developed over the last three years, we were able to delineate the phenotypic range in unprecedented detail. SCN2A is now the first DEE with all available data systematically curated and harmonized in a computable format, allowing for entirely novel insights.

Entering the phenotype era – HPO-based similarity, big data, and the genetic epilepsies

Semantic similarity. The phenotype era in the epilepsies has now officially started. While we can generate and analyze genetic data in the epilepsies at scale, phenotyping typically remains a manual, non-scalable task. This contrast has resulted in a significant imbalance where it is often easier to obtain genomic data than clinical data. However, it is often not a lack of clinical data that causes this problem, but our limited ability to handle it. Clinical data is often unstructured, incomplete, and multi-dimensional, which makes it difficult to analyze meaningfully. Today, our publication analyzing more than 31,000 phenotypic terms in 846 patient-parent trios with developmental and epileptic encephalopathies (DEE) appeared online. We developed a range of new concepts and techniques to analyze phenotypic information at scale, identified previously unknown patterns, and were bold enough to challenge the prevailing paradigms on how statistical evidence for disease causation is generated.

Big data, ontologies, and the phenotypic bottleneck in epilepsy research

Unconnected data. Large datasets are increasingly emerging in biomedicine. These include the genomic, imaging, and EEG datasets that we are somewhat familiar with, but also many large unstructured datasets, including data from biomonitors, wearables, and electronic medical records (EMR). The abundance of these datasets appears to make the promise of precision medicine tangible: an individualized treatment that is based on data and synthesizes available information across various domains for medical decision-making. In a recent review in the New England Journal of Medicine, Haendel and collaborators discuss the need in the biomedical field to focus on the development of terminologies and ontologies, such as the Human Phenotype Ontology (HPO), that help put data into context. This review is a perfect segue to introduce our group's increasing focus on computational phenotypes as a way to overcome the phenotypic bottleneck in epilepsy genetics.