DOIs: A new era for germplasm labelling

DOIs hold great promise for tracking the use of germplasm

Nicola Temple, Scriptoria

Genebank collections around the world hold the raw genetic material needed to breed the diverse crops necessary to feed a burgeoning human population in an increasingly uncertain environment. These collections are not stagnant, held frozen behind lock and key, but dynamic repositories of plant material and associated information that are used for scientific research and crop breeding programmes. For this reason, the quality and management of the information associated with plant genetic resources for food and agriculture (PGRFA) are as important as the material itself.

Historically, every genebank has decided on its own method for labelling accessions. Genebank staff use specific combinations of letters and numbers to create a unique identifier for each accession. They use this unique code throughout the chain of custody: on bags of seeds in the freezer, on stakes in the greenhouse and in their database. This system still works well for managing a single genebank. However, there is no standardised or shared method for assigning unique identifiers to accessions across all genebanks. This makes it difficult to track relationships as these genetic resources are shared, duplicated and used, which limits the capacity to associate new information with the accessions that contributed to generating it.

Consider this scenario: a genebank has a chickpea labelled ‘ABC’ in its collection. If the genebank publishes anything about the material, it will refer to its local identifier, ‘ABC’, along with the scientific name and other pertinent information. If the genebank provides a sample of this germplasm to an institution conducting crop breeding research, the institution might relabel this material ‘123’ to catalogue it within its own identification system. Any publications that result from this research would then refer to a chickpea called ‘123’. So, despite two or more publications arising from the same genetic material, there is no easy way of connecting the dots, because each cites only its own local identifier. Following the material becomes as challenging as tracking down a spy who travels with multiple passports.

Unlike spies, though, plant genetic resources work best when they are easy to identify. Maintaining a single identity, history and body of knowledge around an accession is essential for worldwide scientific research and use of diversity. Germplasm often travels on complicated journeys, not only being shared between institutions but also being crossed with other material or altered through selection. Although genebanks usually ask researchers to provide links to any publications using their source materials, this doesn’t always happen. Important information about plant genetic resources is missed, to the detriment of agriculture and food security.

The key to maintaining germplasm relationships

But that’s in the past. Digital Object Identifiers (DOIs) have now been chosen to provide a globally unique and permanent mechanism for identifying germplasm. A DOI is a standardised alphanumeric string that is assigned by a registration agency and provides a persistent link to the location of information about the object on the Internet (unlike URLs, which can lose connectivity if they aren’t updated). Importantly, DOIs coexist with other identifiers, such as the ones already used by genebanks, allowing curators to keep their current systems in place. So chickpea ‘ABC’ and chickpea ‘123’ don’t have to change their names; they simply gain an additional identity in the form of a DOI.

DOIs were introduced in 2000 by the International DOI Foundation and have largely been used for scholarly publications, datasets and commercial videos. Now they hold great promise for tracking the use of germplasm.

“Plant genetic resources with assigned DOIs will be easier to discover through digital means,” explains Nora Castañeda-Álvarez, Genesys Catalog curator at the Crop Trust. “We expect that the adoption of this standard will help to better track specific accessions to scientific publications and documents.”

The process is relatively straightforward. The provider registers the germplasm with the Global Information System (GLIS) on PGRFA, which is managed by the International Treaty on PGRFA, by giving a description that includes some minimum information: the holder of the material; the scientific name; the method and date of collection; and the holder’s local identifier for the material. GLIS then returns a DOI to the provider, which is permanently assigned to the physical material itself. The DOI is assigned to the material rather than the description of the material because descriptions can change: species and sub-species designations can shift as more is learned about the crop.
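As a rough illustration of the minimum description listed above, the sketch below models a registration as a small Python record and checks that nothing has been left blank. The class name, field names and sample values are hypothetical and chosen for readability; they are not the actual GLIS schema or API.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Registration:
    """Minimum description named in the text (field names are hypothetical)."""

    holder: str             # institution that holds the material
    scientific_name: str    # may be revised later without changing the DOI
    collection_method: str
    collection_date: str
    local_identifier: str   # the holder's own label, e.g. 'ABC'


def missing_fields(registration: Registration) -> list[str]:
    """Return the names of any fields that were left empty."""
    return [
        f.name
        for f in fields(registration)
        if not getattr(registration, f.name).strip()
    ]


# Invented sample values, for illustration only.
chickpea = Registration(
    holder="Example Genebank",
    scientific_name="Cicer arietinum",
    collection_method="field collection",
    collection_date="2016-05-01",
    local_identifier="ABC",
)
print(missing_fields(chickpea))
```

A description that passes a check like this would be ready to submit; the DOI itself is issued by GLIS, not generated locally.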

However, this does not mean that chickpea ‘ABC’ and chickpea ‘123’ will both become chickpea ‘10.18730/A1B2’. A new DOI is assigned whenever germplasm is transferred and added to the collection of another institution. This happens because the germplasm is subject to new conditions, not only within the new holder’s facilities and their quality assurance standards, but also through the transfer process itself. The new DOI allows for any changes in the material, while its relationship to the previous DOI, as well as to any future DOIs, is maintained. This information is recorded in GLIS.

Therefore, our hypothetical pair of chickpea stewards would need to make only a slight addition to their process for sharing genetic resources. The genebank registers its accession ‘ABC’ with GLIS and is provided with a DOI, ‘10.18730/A1B2’. Any publications or datasets produced by the genebank cite both ‘ABC’ and ‘10.18730/A1B2’ to reference the germplasm. As before, the research institution receiving the germplasm relabels it as ‘123’ for its working collection. But this institution also registers the accession with GLIS, providing both its local identifier (‘123’) and the DOI sent along with the seeds by the genebank. GLIS then issues a new DOI to the research institution, ‘10.18730/Z9Y8’. Any publications arising from this material reference both ‘123’ and ‘10.18730/Z9Y8’. However, unlike in the previous scenario, the relationship between the two DOIs is maintained. With a simple web search in GLIS, anyone can see how these accessions are connected.
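The chickpea scenario can be sketched as a toy in-memory registry in which each new registration records the DOI it was derived from. This is only an illustration of the idea of linked DOIs: the class and method names are hypothetical, the DOIs are the made-up examples used in this article, and none of it reflects how GLIS is actually implemented.

```python
class LineageRegistry:
    """Toy stand-in for a registry that links each DOI to its source DOI."""

    def __init__(self):
        # Maps a DOI to (holder, local identifier, parent DOI or None).
        self._records = {}

    def register(self, doi, holder, local_id, parent_doi=None):
        """Record an accession, optionally linked to the DOI it came from."""
        if parent_doi is not None and parent_doi not in self._records:
            raise KeyError(f"unknown parent DOI: {parent_doi}")
        self._records[doi] = (holder, local_id, parent_doi)

    def ancestry(self, doi):
        """Walk back from a DOI to its original source, newest first."""
        chain = []
        while doi is not None:
            _holder, local_id, parent_doi = self._records[doi]
            chain.append((doi, local_id))
            doi = parent_doi
        return chain


registry = LineageRegistry()
registry.register("10.18730/A1B2", "genebank", "ABC")
registry.register(
    "10.18730/Z9Y8", "research institution", "123", parent_doi="10.18730/A1B2"
)

for doi, local_id in registry.ancestry("10.18730/Z9Y8"):
    print(doi, local_id)
```

Looking up the research institution's DOI leads back to the genebank's original accession, which is the link that is lost when only the local labels ‘ABC’ and ‘123’ are cited.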

“Maintaining this relationship is particularly important when material from genebanks is used to develop new varieties,” explains Matija Obreza, Genesys Information Systems Manager at the Crop Trust. “The proper identification of parents via DOIs allows the use of the material to be tracked and that helps when determining the impact of genebank collections. DOIs also allow for automated discovery of publications by scanning the Internet for a specific DOI.”
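The automated discovery mentioned here could, in its simplest form, amount to searching text for a DOI string. The sketch below assumes the documents are already available as plain text; the function name, document titles and contents are invented for illustration, and the DOI is this article's made-up example. A real crawler would need to handle far more cases.

```python
import re


def find_mentions(doi, documents):
    """Return the titles of documents whose text mentions the given DOI.

    DOI names are case-insensitive, so the search ignores case. The
    lookahead stops a shorter DOI from matching inside a longer one.
    """
    pattern = re.compile(re.escape(doi) + r"(?![A-Za-z0-9])", re.IGNORECASE)
    return [title for title, text in documents.items() if pattern.search(text)]


# Invented documents, for illustration only.
documents = {
    "Paper one": "We evaluated chickpea ABC (doi:10.18730/A1B2) for drought tolerance.",
    "Paper two": "Material 10.18730/a1b2c was obtained from a different collection.",
    "Paper three": "Passport data are available at https://doi.org/10.18730/a1b2 online.",
}

print(find_mentions("10.18730/A1B2", documents))
```

Because the DOI is a fixed, globally unique string, even a plain text match of this kind is enough to link a publication to an accession, something that is not possible with local labels such as ‘ABC’ that different institutions may reuse.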

DOIs in action

The Genebank Platform aims to have DOIs assigned to 100% of accessions in the international genebanks managed by CGIAR centres by 2018. However, there are many genebanks that have already started to use DOIs. The Secretariat of the International Treaty on PGRFA provides DOI minting free of charge and has assigned over 560,000 DOIs to date. The International Potato Center (CIP) was an early adopter of DOIs.

“CIP recognised the importance of DOIs in 2017 as a valuable method to keep track of each germplasm accession when sending plants worldwide, producing genebank catalogues, reporting to FAO Treaty, passport information exchange with other repositories, citing germplasm in research papers and labelling germplasm inventories for long term conservation,” says CIP database manager Edwin Rojas. “CIP is now DOI compliant, and DOIs are being recorded in our online GRIN-Global database.”

The Genesys portal, which brings together information on genebank materials, is also DOI compliant. DOIs are recorded in Genesys – providing additional information to support users in their queries. Genesys automatically exchanges information with the GLIS portal.

DOIs are the key to standardising the identification of germplasm, facilitating information sharing and ensuring more effective use of genebanks. To get started using DOIs in your genebank, visit