CTD: The Comparative Toxicogenomics Database

Overview

Contents

  1. Introduction
  2. CTD Development
  3. Primary Data Categories
  4. Footnotes

Introduction

CTD advances understanding of the effects of environmental chemicals on human health.

The etiology of most chronic diseases involves interactions between environmental factors and genes that modulate important physiological processes.[1,2] This assumption is supported by the many complex diseases caused by reversible behaviors or avoidable exposures, and by the relatively small number of diseases attributed to single gene mutations.[2] Environmental factors are implicated in many common conditions such as asthma, cancer, diabetes, hypertension, immune deficiency disorders, and Parkinson’s disease; however, the molecular mechanisms underlying these correlations are not well understood.[3]

CTD includes manually curated data describing cross-species chemical–gene/protein interactions, chemical–disease relationships, and gene–disease relationships. These data help illuminate the molecular mechanisms underlying variable susceptibility and environmentally influenced diseases. They will also provide insights into complex chemical–gene and chemical–protein interaction networks.

CTD Development

CTD is a community-supported public resource. It is being developed at the Mount Desert Island Biological Laboratory (MDIBL), a Marine and Freshwater Biomedical Science (MFBS) Center of the National Institute of Environmental Health Sciences (NIEHS).

Primary Data Categories

[Figure: Chemicals, genes and human diseases are associated through references. Labels: Chemicals, Genes, Diseases, Chemical–Gene Interactions, Chemical–Disease Relationships, Gene–Disease Relationships.]

Chemicals. CTD integrates a chemical subset of the Medical Subject Headings (MeSH®), the hierarchical vocabulary from the U.S. National Library of Medicine. Using the chemical browser, you can view relationships among chemicals, obtain detailed information about them (including structure and toxicology data), and access related CTD data (including references and sequences). You can also use the browser to formulate gene, interaction, and reference queries.

Genes. The cross-species gene vocabulary (symbols, names, and synonyms) in CTD is derived from the Gene database at the National Center for Biotechnology Information (NCBI), a division of the U.S. National Library of Medicine. CTD curators may add to this vocabulary as required (e.g., to represent a species-specific gene that is not curated in NCBI Gene). Chemical, curated interaction, disease, reference, Gene Ontology, organism and sequence data are also provided for genes. You can search for genes by chemical name/symbol, chemical–gene interaction type, gene name/symbol, disease, organism, Gene Ontology annotation, sequence accession identifier, or inclusion in a microarray experiment from the Environment, Drugs, and Gene Expression database (EDGE).

Chemical–Gene/Protein Interactions. To improve understanding of the mechanisms of chemical actions, we are manually curating specific chemical–gene and chemical–protein interactions in vertebrates and invertebrates from the published literature. These interactions are both direct (e.g., “chemical binds to protein”) and indirect (e.g., “chemical results in increased phosphorylation of a protein” via intermediate events).

Interactions are curated using a controlled interaction vocabulary that characterizes common physical, regulatory, and biochemical interactions between chemicals and genes or proteins. This vocabulary comprises 70 terms including actions (e.g., “binds to”, “imports”), operators that describe the degree of a chemical's effect (e.g., “increases”), and qualifiers that specify the form of the gene or chemical involved in an interaction (e.g., “protein” or “chemical metabolite,” respectively).
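
As a rough illustration of how the three kinds of vocabulary terms fit together, the sketch below models one curated interaction as a record with an action, an optional operator, and optional qualifiers. The class name, field names, helper function, and example values (chemical X, GENE1, GENE2) are hypothetical; they are not CTD's actual schema or data.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Interaction:
    """One curated chemical-gene/protein interaction (hypothetical layout)."""
    chemical: str
    gene: str
    action: str                               # e.g. "binds to", "imports"
    operator: Optional[str] = None            # degree of effect, e.g. "increases"
    chemical_qualifier: Optional[str] = None  # e.g. "chemical metabolite"
    gene_qualifier: Optional[str] = None      # e.g. "protein"


def with_action(interactions: List[Interaction], action: str) -> List[Interaction]:
    """Return the interactions whose action term matches exactly."""
    return [i for i in interactions if i.action == action]


# Placeholder records for illustration only, not real curated data.
records = [
    Interaction("chemical X", "GENE1", action="binds to", gene_qualifier="protein"),
    Interaction("chemical X", "GENE2", action="imports", operator="increases"),
]
binding = with_action(records, "binds to")
```

Keeping the operator and qualifiers as separate optional fields, rather than folding them into one free-text string, is one way to let a query filter on any single term of the vocabulary.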

You can search curated interactions directly on the interaction query form. You may also search them on a gene or reference query form by using the “chemical–gene interaction types” field. Curated interactions are presented on gene, reference and chemical detail pages. References that have not yet been curated manually are presented in CTD with chemical–gene/protein associations that have been extracted by an automated information retrieval method. If you perform a gene or reference query without using the “chemical–gene interaction types” field, you will search both manually curated interactions and automated chemical–gene associations.

Diseases. CTD diseases consist of genetic disorders from the Online Mendelian Inheritance in Man (OMIM) database at the National Center for Biotechnology Information (NCBI) and the disease subset of the Medical Subject Headings (MeSH®), resources available through the U.S. National Library of Medicine. CTD curators mapped OMIM diseases to terms within the hierarchical MeSH disease vocabulary to expand disease representation in CTD. This combined disease vocabulary is used to curate gene–disease and chemical–disease relationships. You can browse diseases and use them to formulate gene and reference queries.

Gene–Disease Relationships. CTD contains direct and inferred gene–disease relationships. Direct gene–disease relationships are curated from the published literature by CTD curators, or are derived from the OMIM database using the mim2gene file from the NCBI Gene database. Inferred relationships are established via CTD-curated chemical–gene interactions (e.g., gene A is associated with disease B because gene A has a curated interaction with chemical C, and chemical C has a direct relationship with disease B). Direct and inferred relationships are identified as such, and both help users develop hypotheses about mechanisms underlying environmental diseases.

Chemical–Disease Relationships. CTD contains direct and inferred chemical–disease relationships. Direct chemical–disease relationships are curated from the published literature by CTD curators. Inferred relationships are established via CTD-curated chemical–gene interactions (e.g., chemical A is associated with disease B because chemical A has a curated interaction with gene C, and gene C has a direct relationship with disease B). Direct and inferred relationships are identified as such, and both help users develop hypotheses about mechanisms underlying environmental diseases.
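
The inference described in the two paragraphs above amounts to joining two sets of pairs on a shared entity. The following is a minimal sketch of that idea, assuming each relationship is available as a simple pair of names. The function names and placeholder data are hypothetical and do not reflect how CTD itself computes or stores inferred relationships.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Pair = Tuple[str, str]


def infer_gene_disease(chem_gene: Iterable[Pair],
                       chem_disease: Iterable[Pair]) -> Dict[Pair, Set[str]]:
    """Infer (gene, disease) pairs through a shared chemical.

    chem_gene:    curated (chemical, gene) interactions
    chem_disease: direct (chemical, disease) relationships
    Returns {(gene, disease): chemicals that support the inference}.
    """
    diseases_by_chemical = defaultdict(set)
    for chemical, disease in chem_disease:
        diseases_by_chemical[chemical].add(disease)

    inferred = defaultdict(set)
    for chemical, gene in chem_gene:
        for disease in diseases_by_chemical.get(chemical, ()):
            inferred[(gene, disease)].add(chemical)
    return dict(inferred)


def infer_chemical_disease(chem_gene: Iterable[Pair],
                           gene_disease: Iterable[Pair]) -> Dict[Pair, Set[str]]:
    """Infer (chemical, disease) pairs through a shared gene.

    Returns {(chemical, disease): genes that support the inference}.
    """
    diseases_by_gene = defaultdict(set)
    for gene, disease in gene_disease:
        diseases_by_gene[gene].add(disease)

    inferred = defaultdict(set)
    for chemical, gene in chem_gene:
        for disease in diseases_by_gene.get(gene, ()):
            inferred[(chemical, disease)].add(gene)
    return dict(inferred)


# Placeholder names mirroring the examples in the text, not real data.
chem_gene = [("chemical C", "gene A"), ("chemical A", "gene C")]
chem_disease = [("chemical C", "disease B")]
gene_disease = [("gene C", "disease B")]

gene_links = infer_gene_disease(chem_gene, chem_disease)
chemical_links = infer_chemical_disease(chem_gene, gene_disease)
```

Recording the bridging chemical or gene alongside each inferred pair keeps the supporting evidence visible, which matters when an inferred relationship is used as the starting point for a hypothesis.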

References. CTD contains references related to toxicologically significant vertebrate and invertebrate genes, sequences and associated chemicals. References were identified by information retrieval methods and comprise a subset of MEDLINE®/PubMed®, a database of the U.S. National Library of Medicine. You can search for references by chemical, gene, organism taxon, disease, citation information and accession identifier. Using manual and automated methods, we are curating chemical–gene interactions from these references.

Organisms. CTD's hierarchical organism vocabulary consists of the Eumetazoa (vertebrates and invertebrates) component of the Taxonomy Database from NCBI, a division of the U.S. National Library of Medicine. You can browse organisms and use them to formulate gene, interaction, and reference queries.

Sequences. CTD contains both nucleotide and protein sequences. Nucleotide sequences are acquired from NCBI, a division of the U.S. National Library of Medicine. For Homo sapiens, Mus musculus, Rattus norvegicus, Drosophila melanogaster and Caenorhabditis elegans, only Reference Sequences (RefSeqs) are included. For all other vertebrates and invertebrates, both RefSeq and GenBank® sequences are included. Protein sequences are acquired from EBI; all vertebrate and invertebrate UniProt sequences are included. You can access sequences via a gene query or the chemical browser.
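
The nucleotide inclusion rule above can be written as a small predicate. This is only a sketch of the stated rule. The function name and the source labels "RefSeq" and "GenBank" as string values are assumptions for illustration, and Danio rerio is used merely as an example of an organism outside the five listed species.

```python
# Organisms for which, per the text, only RefSeq nucleotide sequences are included.
REFSEQ_ONLY_ORGANISMS = frozenset({
    "Homo sapiens",
    "Mus musculus",
    "Rattus norvegicus",
    "Drosophila melanogaster",
    "Caenorhabditis elegans",
})


def include_nucleotide_sequence(organism: str, source: str) -> bool:
    """Apply the stated inclusion rule for a nucleotide sequence (hypothetical helper)."""
    if source == "RefSeq":
        # RefSeq sequences are included for every organism.
        return True
    if source == "GenBank":
        # GenBank sequences are included only outside the RefSeq-only organisms.
        return organism not in REFSEQ_ONLY_ORGANISMS
    # Any other source label is outside the rule described in the text.
    return False


human_genbank = include_nucleotide_sequence("Homo sapiens", "GenBank")
zebrafish_genbank = include_nucleotide_sequence("Danio rerio", "GenBank")
```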

Gene Ontology (GO). GO annotations are integrated with gene and UniProt (Swiss-Prot/TrEMBL) protein sequence data in CTD. You can browse GO and use it to formulate gene and interaction queries.

Pathways. KEGG pathway data are a collection of manually drawn pathway maps representing current knowledge of molecular interaction and reaction networks. These data are integrated with chemicals, genes and diseases in CTD to provide insights into molecular networks that may be affected by chemicals, and into possible mechanisms underlying environmental diseases. You can use KEGG pathway names or KEGG IDs to formulate gene and interaction queries. Pathway information is provided on chemical, gene/protein and disease detail pages.

Footnotes

[1] Schwartz DA, Freedman JH, Linney EA. Environmental genomics: a key to understanding biology, pathophysiology and disease. Hum Mol Genet. 2004 Oct 1;13 Spec No 2:R217-24.
[2] Olden K, Wilson S. Environmental health and genomics: visions and implications. Nat Rev Genet. 2000 Nov;1(2):149-53.
[3] Toscano WA, Oehlke KP. Systems biology: new approaches to old environmental health problems. Int J Environ Res Public Health. 2005 Apr;2(1):4-9.