NHGRI/DIR Research Projects

  • Animal Proteome DataBase (AniProtDB)
    A comprehensive database of high-quality metazoan proteomes generated in a robust and consistent fashion from SRA data. The database includes information on predicted proteins and protein domains and the ability to perform sequence similarity searches against all proteomes generated using this pipeline.
  • Atlas of Human Malformation Syndromes in Diverse Populations
    A photo atlas of individuals from diverse populations who are affected by malformation syndromes.
  • Breast Cancer Information Core (BIC)
    A central repository for information regarding mutations and polymorphisms in breast cancer susceptibility genes.
  • Clinical Genomic Database (CGD)
    A manually curated database of all conditions with known genetic causes, focusing on the utility of genetic/genomic diagnosis and the availability of disease-specific interventions.
  • CRISPRz
    A curated database of validated CRISPR targets in zebrafish.
  • Goldfish Genome Project
    Genome browser for the goldfish, Carassius auratus, including tracks for gene expression and multispecies sequence conservation.
  • Hydra 2.0 Genome Project Portal
    Access to sequence data and related information on Hydra, a valuable experimental model for the study of numerous biological processes, including regeneration, senescence, axial patterning, cell signaling, and development.
  • Hydra AEP Genome Project Portal
    Access to the newly sequenced chromosome-level assembly and related data on the AEP strain of Hydra vulgaris, a valuable experimental model for the study of fundamental biological principles, including patterning, stem cell biology, aging, and regeneration.
  • Hydractinia Genome Project Portal
    Access to sequence data and related information on Hydractinia, a model system for the study of fundamental biological processes such as regeneration, allorecognition, and stem cell biology.
  • Limb Morphology Database
    Standardized terms used to describe human morphology developed by an international group of clinicians working in the field of dysmorphology.
  • Mnemiopsis Genome Project Portal
    Access to the annotated Mnemiopsis genomic sequence, the first set of publicly available whole-genome sequencing data from any ctenophore species.
  • Multiplex Initiative
    A large, multi-disciplinary research collaboration to examine the effects of genetic susceptibility testing for several common health conditions.
  • NHGRI Dog Genome Project
    Information for researchers and dog owners interested in finding the genetic basis of morphologic traits, behaviors, or diseases in the domestic dog in order to improve the health and well-being of dogs and their human companions.
  • Pallister-Hall Syndrome
    Information for professionals and families caring for or affected by Pallister-Hall Syndrome.
  • Pigment Cell Gene Resource
    A centralized, comprehensive resource of published scientific data relevant to pigment cell biology.
  • Red Cell Membrane Disorder Mutations Database
    A database of confirmed mutations linked to inherited disorders of the erythrocyte membrane that are associated with hemolytic anemia, including Hereditary Spherocytosis (HS), Hereditary Elliptocytosis (HE), and Hereditary Pyropoikilocytosis (HPP).
  • Supplementary Material
    Supplementary material not available through publishers' Web sites for NHGRI manuscripts published from 2012 to the present.
  • Zebrafish Insertion Collection (ZInC)
    A Web-based, searchable collection of zebrafish mutations generated by DNA insertion.

NHGRI/DIR-Developed Software and Analysis Tools

  • ampliconDIVider
    ampliconDIVider identifies deletion and insertion variants (DIVs) in DNA amplicons.
  • bam2mpg
    A Bayesian genotype caller for next-generation sequencing data.
  • BuddySuite
    A collection of four independent, yet interrelated, command-line programs that facilitate each step in the workflow of sequence discovery, curation, alignment, and phylogenetic reconstruction.
  • Complementary Pairs Stability Selection for Genome-Wide Association Studies (ComPaSS-GWAS)
    An ad hoc alternative to replication that can reduce type I errors for GWA studies when appropriate replication data are not available.
  • Conserved Domain-based Prediction (CDPred)
    A computational algorithm designed to calculate the predicted effect of an amino acid substitution, relative to the reference sequence, within functional modules (the protein domains).
  • Entanglement Mapping
    Entanglement mapping (EM) is a new Random Forests-based method for detecting interactions between important features, such as epistatic interactions between SNPs in genome-wide association studies.
  • GeIST
    A set of files and scripts used to detect and annotate MLV integration sites.
  • GeneLink
    A data management system designed to facilitate genetic studies of complex traits.
  • Genometric Analysis Simulation Program (G.A.S.P.)
    A software tool that can generate samples of family data based on user-specified genetic models.
  • r2VIM
    A new recurrency-based variable selection method in random forests for genome-wide genetic association studies.
  • ROMPrev
    A software suite for quantitative trait and locus-specific heritability estimation and association testing using the revised ROMP method.
  • Shimmer
    A tool for the detection of genetic alterations in tumors from next-generation sequencing data.
  • SKIPPY
    A tool for scoring exonic variants for features associated with exon skipping and ectopic splice site creation.
  • SOOP
    A tool for the design and selection of overgo probes optimized for high-throughput comparative mapping.
  • SubmiRine
    A software package for predicting microRNA target site variants (miR-TSVs) from clinical genomic data sets.
  • Tiled Regression Analysis
    A software framework for selecting a set of genetic predictors that jointly and independently explain trait variation with an additive regression model.
  • trieFinder
    A tool that rapidly maps sequence tags to RefSeq, UniGene, and genomic sequences, providing output amenable to both transcript quantification and the detection of novel transcripts.
  • Var-MD
    An annotation and analysis tool for next-generation sequencing variants in rare diseases and small pedigrees.
  • VarSifter
    VarSifter is a graphical Java program designed to display, sort, filter, and generally sift variation data from massively parallel sequencing experiments.