The decoding of the human genome is among the great scientific achievements. For a copyeditor, decoding genetics terminology in research articles can also feel like quite an achievement, albeit on a much smaller scale.
Despite established international guidelines for gene nomenclature, written discussion of genes and proteins is frequently unclear. Does that reference to APOE mean the APOE gene or the protein? When the authors wrote about Foxp2 expression, were they referring to the human gene or the mouse gene? Contextual clues, general familiarity with the nomenclature guidelines, and a few good sources can help the copyeditor discern mice from men, proteins from genes, when the author is not specific.
By convention, symbols assigned to human gene names typically comprise uppercase, italicized letters and may contain Arabic numerals. For example, the approved symbol for the forkhead box P2 gene is FOXP2, set in italics. In roman type, the same symbol refers to the protein encoded by that gene: FOXP2. In an animal model, the same gene might have only the first letter in uppercase and the remaining letters in lowercase (e.g., Foxp2). Variations in nomenclature exist for other entities such as bacteria, viruses, mutated genes, and oncogenes, but determining whether the author is referring to a gene or a protein is usually straightforward when the guidelines are followed.
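For editors who like to see a rule written out, the convention above can be sketched as a small Python function. This is only an illustration of the capitalization and italics rules as described in this article, not an official tool. The function name `classify_symbol` and its return labels are made up for this example, and it deliberately returns "ambiguous" for any case the article does not cover (for instance, a mixed-case symbol in roman type).

```python
def classify_symbol(symbol: str, italic: bool) -> str:
    """Guess what a symbol refers to, using only the convention described above.

    Hypothetical helper for illustration; the labels are not official terms.
    """
    letters = [c for c in symbol if c.isalpha()]
    # Symbols are expected to be letters and Arabic numerals, starting with a letter.
    if not letters or not symbol.isalnum() or not symbol[0].isalpha():
        return "ambiguous"
    if all(c.isupper() for c in letters):
        # All uppercase: italics signal the human gene, roman type the protein.
        return "human gene" if italic else "protein"
    if letters[0].isupper() and all(c.islower() for c in letters[1:]):
        # Initial capital only: an animal-model gene when italicized.
        # The roman-type case is not covered by the convention above.
        return "animal-model gene" if italic else "ambiguous"
    return "ambiguous"


print(classify_symbol("FOXP2", italic=True))
print(classify_symbol("FOXP2", italic=False))
print(classify_symbol("Foxp2", italic=True))
```

An "ambiguous" result is the cue to look for contextual clues or to query the author rather than guess.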
Where it can get particularly confusing for the reader—and where the copyeditor becomes sleuth—is when the symbols and the terminology don’t quite match. If an action is being discussed, the protein is most likely either the instigator or the subject of the action. Mention of transcription, translation, or regulation likely indicates discussion of the gene. But both genes and proteins can be expressed or have mutations. As “the last editor and the first reader,” the copyeditor must be able to determine whether the authors’ intended meaning is clear.
An additional complication is that gene names change in response to ongoing research. A search of the HUGO Gene Nomenclature Committee’s website (http://www.genenames.org/) shows, for example, that the symbol SPCH1 has been withdrawn and that FOXP2 should be used instead. As copyeditors, we have to determine how to present this genetic content clearly and accurately while preserving the author’s meaning and respecting common usage in the field.
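A check for withdrawn symbols could be sketched along these lines in Python. This assumes the editor keeps a small, hand-maintained table of withdrawn symbols looked up on the HGNC website; the table `WITHDRAWN` and the function `flag_withdrawn` are hypothetical names, and the only entry shown is the SPCH1 example mentioned above. It does not query the HGNC database itself.

```python
import re

# Hand-maintained lookup: withdrawn symbol -> currently approved symbol.
# Only the example from this article is included; extend it from genenames.org.
WITHDRAWN = {"SPCH1": "FOXP2"}


def flag_withdrawn(text: str, withdrawn: dict = WITHDRAWN) -> list:
    """Return (symbol found, suggested symbol, character offset) for each hit."""
    findings = []
    # Candidate symbols: a letter followed by letters or Arabic numerals.
    for match in re.finditer(r"\b[A-Za-z][A-Za-z0-9]*\b", text):
        token = match.group()
        if token in withdrawn:  # exact, case-sensitive match
            findings.append((token, withdrawn[token], match.start()))
    return findings


for hit in flag_withdrawn("Mutations in SPCH1 were reported."):
    print(hit)
```

The output is a list of places to query, not a list of automatic replacements: whether to change a symbol still depends on the author's meaning and on common usage in the field.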
Almost 33,000 symbols have been approved for human genes. The symbols are easier to use than full gene names in spoken and written communication, forming something of a lingua franca among geneticists and molecular biologists. Careful editing for both clarity and accuracy, querying authors when contextual clues are elusive, and requesting that authors carefully review genetics terminology in their manuscripts all help ensure that research won’t get lost in translation.
We find these resources useful when editing gene nomenclature:
- The searchable database of the Human Genome Organisation (HUGO) Gene Nomenclature Committee: http://www.genenames.org/
- AMA Manual of Style, 10th ed. New York, NY: Oxford University Press; 2007:608-659. Section 15.6.2-15.6.5.
- The U.S. National Institutes of Health Gene website: http://www.ncbi.nlm.nih.gov/gen