Unlocking Novel Insights Into Rare Disease Commonalities Through Multimodal Data Integration
Population frequency and impact analysis predictions for all dbSNP, gnomAD and ClinVar variants in rare disease genes.
Links to literature articles on rare diseases with genetic etiology and rare disease associated genes.
Manually curated variant details and associated clinical symptoms.
Integration with 3D protein structure to visualize structural impact of the curated and annotated variants.

Rare Disease Information
Rare diseases are individually rare but collectively common. Details from
GARD and the disease harmonization database at NCATS are integrated to allow investigation of
rare diseases with genetic etiology. The feature provides easy access to links to published rare
disease literature, disease databases and other supporting data sources.

Gene Information
Details on genes associated with rare diseases are presented with integrated
information on their aliases and disease associations obtained from multiple data sources. The
feature allows access to links to published literature, associated variants, and other
supporting data sources.

Literature AI
Natural language processing methods, including transformer models, are used to find
references to rare diseases and their associated genes in the titles and abstracts of articles
published in PubMed. The feature integrates disease aliases obtained from the disease
harmonization database at NCATS and multiple gene term variations from the EntrezGene database.
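
To illustrate how alias lists can be used to tag an article, here is a minimal sketch of dictionary-based matching over a title and abstract. It is not the portal's pipeline and does not use a transformer model. The alias tables, the example disease and gene (cystic fibrosis, CFTR), and the function names find_mentions and tag_article are all assumptions made for illustration.

```python
import re

# Hypothetical alias tables for illustration only. In practice these would be
# loaded from a disease alias source and a gene alias source.
DISEASE_ALIASES = {
    "cystic fibrosis": ["cystic fibrosis", "mucoviscidosis"],
}
GENE_ALIASES = {
    "CFTR": ["CFTR", "ABCC7"],
}


def find_mentions(text, alias_table):
    """Return the sorted canonical names that have at least one alias in text."""
    hits = set()
    for canonical, aliases in alias_table.items():
        for alias in aliases:
            # Match the alias only as a whole token, not inside a longer word.
            pattern = r"(?<!\w)" + re.escape(alias) + r"(?!\w)"
            # All-uppercase symbols (gene symbols) are matched case-sensitively;
            # everything else is matched case-insensitively.
            flags = 0 if alias.isupper() else re.IGNORECASE
            if re.search(pattern, text, flags):
                hits.add(canonical)
                break
    return sorted(hits)


def tag_article(title, abstract):
    """Tag one article with the diseases and genes mentioned in its text."""
    text = f"{title} {abstract}"
    return {
        "diseases": find_mentions(text, DISEASE_ALIASES),
        "genes": find_mentions(text, GENE_ALIASES),
    }
```

Calling tag_article on a title that mentions "mucoviscidosis" and "ABCC7" would report the canonical names "cystic fibrosis" and "CFTR". A model-based approach would additionally be needed to handle ambiguous abbreviations and unseen spellings.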

Manually Curated Variants
A manual curation workflow was developed to provide a high-accuracy
correlation data set. Curated annotations include clinical symptoms, functional analysis and
interactive visualizations for detailed review of the variants. The initial disease manually
curated was X-linked creatine transporter deficiency (CTD) and the associated disease-causing gene,

Variant Annotations and 2D/3D Protein Structures
Thousands of variants in genes associated with rare diseases were downloaded and
annotated. UniProt, MolArt and annotated variants are integrated to determine the variant
positions on the protein sequence and structure. Interactive visualization helps to elucidate
the functional impact of the variants.
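
As a rough illustration of placing a variant on a protein sequence, the sketch below parses a simple missense variant written in HGVS protein notation (for example p.Ala4Val) and checks it against a sequence string. It is not the portal's implementation. The function name locate_variant and the toy sequence in the usage note are assumptions, and only simple three-letter missense notation is handled.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Matches simple missense notation such as "p.Ala4Val".
MISSENSE = re.compile(r"^p\.([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})$")


def locate_variant(hgvs_p, sequence):
    """Return (1-based position, reference residue, alternate residue).

    Raises ValueError if the notation is not a simple missense change, the
    position is outside the sequence, or the reference residue does not match.
    """
    match = MISSENSE.match(hgvs_p)
    if match is None:
        raise ValueError(f"unsupported notation: {hgvs_p}")
    ref = THREE_TO_ONE.get(match.group(1))
    alt = THREE_TO_ONE.get(match.group(3))
    position = int(match.group(2))
    if ref is None or alt is None:
        raise ValueError(f"unknown amino acid code in: {hgvs_p}")
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    # HGVS positions are 1-based; Python string indices are 0-based.
    if sequence[position - 1] != ref:
        raise ValueError(f"reference residue mismatch at position {position}")
    return position, ref, alt
```

With a toy sequence such as "MKTAYIAKQR", locate_variant("p.Ala4Val", ...) resolves to position 4. The reference-residue check is a cheap way to catch a variant that was annotated against a different isoform than the sequence being displayed.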