Gene family matters: expanding the HGNC resource

GENOME DATABASE

Open Access

Louise C Daugherty*, Ruth L Seal, Mathew W Wright and Elspeth A Bruford

Abstract

The HUGO Gene Nomenclature Committee (HGNC) assigns approved gene symbols to human loci. There are currently over 33,000 approved gene symbols, the majority of which represent protein-coding genes, but we also name other locus types such as non-coding RNAs, pseudogenes and phenotypic loci. Where relevant, the HGNC organises these genes into gene families and groups. The HGNC website http://www.genenames.org/ is an online repository of HGNC-approved gene nomenclature and associated resources for human genes, and includes links to genomic, proteomic and phenotypic information. We also maintain dedicated gene family web pages, and are currently expanding these and generating more using data curated by the HGNC and information derived from external resources that focus on particular gene families. Here, we review our current online resources with a particular focus on our gene family data, using it to highlight our new Gene Symbol Report and gene family data downloads.

* Correspondence: [email protected]
European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK

The HGNC: background and relevance

The HUGO Gene Nomenclature Committee (HGNC) has been responsible for assigning unique and informative gene names and symbols to every human gene for over 30 years. Approved gene names and symbols preferably describe the structure, function or homology of a gene and its products. The provision of approved nomenclature allows researchers to discuss genes unambiguously, and this is reflected by HGNC symbol usage in scientific papers describing human genes, which aids the dissemination and interpretation of the associated data by the scientific community. The HGNC website [1,2] provides direct links to genomic, proteomic and phenotypic information that is held in the HGNC database and enables users to search and download current data associated with their gene(s) of interest. As of February 2012, there are over 33,000 approved human gene symbols (including protein-coding genes, pseudogenes, ncRNA genes and phenotypes), each with a publicly available Gene Symbol Report. Although the main focus of the HGNC is human genes, there are coordinated efforts with other nomenclature committees [3], in particular the Mouse Genomic Nomenclature Committee (MGNC) [4] and Rat Genome Database (RGD) [5], and any large new gene family reorganisation or assignment is usually coordinated among these three nomenclature groups.

The HGNC also regularly works with specialist advisors and publishes scientific papers concerning gene family nomenclature and gene grouping [6-9]. The adoption of HGNC-approved gene names/symbols by the many genome browsers and databases reduces uncertainty when referring to genes; for example, Ensembl [10], Entrez Gene [11], GeneCards [12], OMIM [13],