MalariaGEN: A call for open malaria parasite data
MalariaGEN is issuing a call for open malaria parasite data and invites researchers, institutions, and data owners to join this global effort.
As a collective effort by partners from more than 40 countries, the MalariaGEN community has been building genomic data resources to answer questions on how genome variation in human, Plasmodium and Anopheles populations impacts the spread and persistence of malaria.
MalariaGEN resources have been growing rapidly thanks to the concerted efforts of partners who collect, sequence, and collate genomic data. The latest parasite data release in the series, Pf7, is an open dataset of Plasmodium falciparum genome variation in 20,000+ worldwide samples from 33 malaria-endemic countries collected over 3 decades. The upcoming Pf8 release promises to further expand these resources through an even richer repository of parasite genomic data.
To accelerate progress in addressing malaria, there is an increasing need for more comprehensive and diverse genomic data on parasites and vectors spanning a wider geographic range across time. Such data enable researchers, epidemiologists, and public health officials to fill knowledge gaps faster and more effectively.
The MalariaGEN team will identify data that has already been published and is available on the European Nucleotide Archive (ENA) with suitable metadata for inclusion in the data releases. Where we plan to include such data, we will work with the authors and data owners.
Researchers, institutions, and data owners are invited to join us in this global effort. Specifically, we are seeking whole-genome sequencing data for Plasmodium falciparum and Plasmodium vivax that meets the following criteria:
- Availability: Data is deposited on the European Nucleotide Archive (ENA) or the Sequence Read Archive (SRA).
- Metadata: This should include the collection date and location, ideally the exact date and GPS coordinates of the health facility or administrative division. As a minimum, the year and country of collection must be present.
- Data: Whole-genome sequence data, ideally with a typical coverage of at least 30x. While Illumina sequencing is ideal, other technologies will be considered, particularly long-read sequencing technologies such as Oxford Nanopore Technologies (ONT) or PacBio.
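For contributors who want to pre-screen their own sample sheets, the criteria above can be sketched as a simple check. This is only an illustrative sketch, not a MalariaGEN tool: the `SampleRecord` layout, its field names, and the `check_sample` function are hypothetical, and the 30x coverage figure is treated as a preference rather than a cutoff, since the call describes it as ideal.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

ACCEPTED_ARCHIVES = {"ENA", "SRA"}
ACCEPTED_SPECIES = {"Plasmodium falciparum", "Plasmodium vivax"}
PREFERRED_COVERAGE = 30.0  # "ideally at least 30x": advisory, not a hard limit


@dataclass
class SampleRecord:
    """Hypothetical record layout; field names are illustrative only."""
    species: str
    archive: str
    collection_year: Optional[int]
    country: Optional[str]
    mean_coverage: Optional[float] = None


def check_sample(sample: SampleRecord) -> Tuple[List[str], List[str]]:
    """Return (blocking problems, advisory notes) for one sample."""
    problems: List[str] = []
    notes: List[str] = []
    if sample.species not in ACCEPTED_SPECIES:
        problems.append("species is not P. falciparum or P. vivax")
    if sample.archive.upper() not in ACCEPTED_ARCHIVES:
        problems.append("data is not deposited on ENA or SRA")
    # Minimum metadata stated in the call: year and country of collection.
    if sample.collection_year is None:
        problems.append("collection year is missing")
    if not sample.country:
        problems.append("country of collection is missing")
    # Coverage only produces a note, because 30x is described as ideal.
    if sample.mean_coverage is None:
        notes.append("coverage is unknown")
    elif sample.mean_coverage < PREFERRED_COVERAGE:
        notes.append("coverage is below the preferred 30x")
    return problems, notes


# Example with made-up values: meets the minimum criteria, low coverage.
example = SampleRecord("Plasmodium falciparum", "ENA", 2019, "Ghana", 22.5)
example_problems, example_notes = check_sample(example)
```

A sample with an empty problems list meets the minimum stated requirements; the notes only flag where it falls short of the ideal.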
Read here for more information.
Contribution deadline: 3 March 2025.
