Open Access Data Online to Advance Malaria Research

Background


The Malaria Atlas Project (MAP) was established in 2005 to provide evidence-based estimates of populations at malaria risk using a cartographic approach. Then and now, parasite prevalence surveys that measure the proportion of a community with parasites in their blood (commonly referred to as PR or parasite rate surveys) provide the bulk of the global information available on malaria endemicity. In the intervening years, the MAP group conducted extensive searches of the published literature, including peer-reviewed journal articles and published reports, and contacted hundreds of malaria control programmes, research groups, Ministries of Health and aid agencies to request access to their survey data. All survey results received were disaggregated to individual sites, individual dates and individual parasite species; duplicates were excluded, and a precise geoposition was calculated for each site where one was not provided with the data. In 2010, the resulting database contained cleaned and geopositioned Plasmodium falciparum prevalence survey records for 22,249 unique site-date combinations. An interim dataset was used to produce MAP's 2007 global map of P. falciparum endemicity, and the full dataset was used for the 2010 version, which incorporated vastly improved methods and added covariates. These data have also been used to undertake national analyses tailored to individual country needs, for example for Somalia, Kenya, Indonesia, Djibouti and Sudan. Since 2010, new data have been collated and geopositioned, and the current total is 24,210 survey records. Of these, 9,970 were used in MAP's modelling work to estimate the global endemicity of Plasmodium vivax malaria.
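The disaggregation and de-duplication step described above can be illustrated with a minimal Python sketch. This is not MAP's actual pipeline: the record fields, the site identifiers and the rule of keeping the first record seen for each site-date-species combination are all assumptions made for illustration, and the survey values are invented.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SurveyRecord:
    """One survey result for a single site, date and parasite species.

    All field names are hypothetical; they are not taken from MAP's database.
    """
    site: str
    survey_date: date
    species: str
    examined: int   # number of people tested
    positive: int   # number found with parasites in their blood

    def __post_init__(self):
        if self.examined <= 0 or not 0 <= self.positive <= self.examined:
            raise ValueError("need examined > 0 and 0 <= positive <= examined")

    @property
    def parasite_rate(self) -> float:
        """Parasite rate (PR): the proportion of those examined who were positive."""
        return self.positive / self.examined


def deduplicate(records):
    """Keep the first record seen for each site-date-species combination."""
    unique = {}
    for rec in records:
        key = (rec.site, rec.survey_date, rec.species)
        unique.setdefault(key, rec)
    return list(unique.values())


# Invented example data: the second record repeats the first.
records = [
    SurveyRecord("site-A", date(2008, 5, 1), "P. falciparum", 200, 50),
    SurveyRecord("site-A", date(2008, 5, 1), "P. falciparum", 200, 50),
    SurveyRecord("site-A", date(2008, 5, 1), "P. vivax", 200, 10),
    SurveyRecord("site-B", date(2009, 11, 15), "P. falciparum", 120, 30),
]

unique_records = deduplicate(records)
for rec in unique_records:
    print(rec.site, rec.survey_date, rec.species, rec.parasite_rate)
```

Keying on the (site, date, species) triple mirrors the level to which the survey results were disaggregated. A real workflow would also need rules for reconciling near-duplicates, such as the same survey reported in two sources with slightly different site names, which this sketch does not attempt.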

From the outset, MAP undertook to ensure that all data output by its models would be made available in the public domain. Details of the scripts used to run MAP's models were made available on GitHub under a GNU Public Licence for open source code. A large portion (80%) of the parasite survey data collated was unpublished, and none of these data had been generated by MAP, so a new exercise was initiated to seek data release permissions from the original data sources.

In addition, MAP launched a parallel exercise to collate anopheline mosquito surveys that reported the occurrence and/or absence of the dominant P. falciparum malaria vector species. These data were almost exclusively collated from published, peer-reviewed journal articles. Again, survey results were disaggregated to individual sites, individual dates and individual species (or species complex/subgroup); duplicates were excluded, and a precise geoposition was calculated for each site where one was not provided with the data. This exercise yielded survey results for 30,324 site-date-species combinations extracted from 2,060 published articles. These data were used to estimate the spatial distribution of 41 dominant vector species in three distinct regions of the world.

A suite of work on human genetic variants of relevance to malaria-endemic countries is also underway, and MAP has committed to making the survey data collated for this work available online. This release is described fully in a separate publication and is summarized here.

This article describes both the survey data and the modelled data that can now be found online and outlines the mechanisms of release that were developed with the aim of making the datasets accessible and readily located, intelligible, assessable and useable.
