Genesys, the online portal for information on genebank samples, is constantly evolving. First launched in 2008, it now contains data on more than four million accessions and counting, which is over half of the estimated total holdings of the world’s genebanks. Recent developments on the platform have made it easier for users to sift through all this information to find exactly what they need.
Matija Obreza, the person responsible for Genesys at the Crop Trust, has visited the 11 CGIAR genebanks as well as numerous other partner organizations to work with staff to ensure the latest data is uploaded and to get feedback on Genesys’s development.
“All 11 CGIAR genebanks already have good passport data on Genesys,” said Matija. “This includes key information like accession number, species name and the origin of material. Genebank data curators have been diligent in updating their records on Genesys.”
Three big enhancements to Genesys now go beyond such basic passport data. “The portal now allows genebanks to publish phenotypic datasets, subsets of accessions, and images,” said Matija. “During the visits we discuss the new features and work with genebank staff to prepare the data for publication.”
Characterization and evaluation datasets
Genebank staff and their partners routinely grow accessions in the field and record their morphological and agronomic attributes. Such datasets provide researchers and breeders with useful information on which to base requests. Genesys can now handle such datasets, complementing the passport data that has been its bread and butter up to now.
“Passport data doesn’t tell users what the accession looks like and how it performs,” Matija said. “So we’ve built functionality into Genesys to allow genebanks to upload characterization and evaluation datasets too. Genebanks actually focus more on the ‘characterization’ part rather than ‘evaluation,’ which is done by crop specialists. The genebanks will usually characterize the plants when they grow them in the field. They frequently take pictures of the entire plant, leaves, flowers, seeds and fruits.”
The groundwork for this feature started with a project funded by Germany’s Federal Office for Agriculture and Food (BLE). That helped set up the infrastructure required to manage diverse datasets from disparate sources in a way that would allow anyone to understand them.
The actual uploading of the datasets to Genesys is the easy part; tracking down and preparing the data is harder. “The genebank staff sometimes have to become archaeologists,” Matija said. “Often data is archived and forgotten, especially when staff has retired. Once they find some relevant data, they have to make sense of it: the data requires cleaning before publication on Genesys. Characterization draws on published descriptor lists, but genebanks will often use their own customized descriptors, so we document and publish those too.”
Genesys now makes available more than 400 characterization and evaluation datasets on a wide range of material from cultivated Bambara groundnut to wild rice species. The process of documenting and uploading datasets has been streamlined and is available to all interested genebanks.
Subsets
CGIAR genebanks distribute around 100,000 germplasm samples every year. Many prospective genebank users go to Genesys to search for the samples they need. But the information available is all too often still limited to passport data, so requesters often order more material than they need, just to be on the safe side. That’s not very efficient.
Subsets are carefully curated lists of accessions with specific characteristics. A subset can be all wheat varieties that are good for making tortillas, or cowpeas that are known to be drought-resistant, or high vitamin D cassava varieties, or the most distributed potatoes. They help users by pre-packaging and presenting groups of accessions that are likely to correspond with their needs.
“Creating a subset is like telling a story,” Matija said. “Curators have a lot of stories about accessions in their collection, and we try to document and share those as subsets. For example, ICRAF has a group of accessions which are considered to be particularly fast-growing timber species. Rather than scouring the literature and then searching individually for accessions, or asking the genebank manager, a user can just ask for the subset with one click.”
The genebanks have plenty of stories to tell. Genesys now offers more than 100 subsets.
Images
Genesys also enables genebanks to share photographs and other image documentation on their accessions, such as scanned collecting forms.
“A picture is worth a thousand words even for genebank accessions,” Matija said. “You want to see what things look like. Data is useful and you can write about an accession in a dataset. But an image provides information that is hard to capture otherwise. People looking at rice or bean photos can pick things out visually that are difficult to describe in words. Users can make more informed decisions based on images. We help genebank staff organize existing photo archives and link the images to the accessions.”
Genesys now makes available 160,000 images of 115,000 accessions from nine CGIAR genebanks.
Always a work in progress
The recent improvements to Genesys make it easier for users to find the accessions they need and help reveal the enormous diversity of genebank collections. “But data management evolves all the time, and documentation work never ends,” Matija said.