The advent of cost-effective, high-throughput genomic sequencing technologies represents an important point of inflection in global public health and the prevention and control of infectious diseases. Public health departments throughout the world have significant experience in the development and standardization of new laboratory techniques and protocols. Unlike many preceding technologies, however, high-throughput pathogen sequencing has significant data management, information technology and bioinformatics requirements that demand substantial new investments in infrastructure and workforce.
As microbial sequencing becomes more routine, access to flexible, sustainable bioinformatics capacity has become one of the most critical emerging needs for public health laboratories throughout the world. We see a tremendous opportunity to advance global public health through the development of an open source, community-supported ecosystem for bioinformatic software development, implementation, validation and support. Current models for public health bioinformatic software development are fragmented and rely heavily on government platforms and academic software development, often to the exclusion of interested contributors. Openness will help reduce many of the current barriers to sustainable bioinformatics capacity in public health by enabling broader participation; defining key gaps and shared priorities; improving the modularity, scalability and interoperability of existing software; developing a mechanism to help fund critical software and development gaps; and ensuring the sustainability of core software, databases and tools.
At present, there is no organized effort to champion the development of open source, reproducible bioinformatics, or to support the development of bioinformatic and data standards, architectures and methods for public health. A global, community-driven effort is desperately needed to help build consensus on technical solutions, to drive the establishment of best practices and reference models, and to provide a forum to debate and develop new standards for data exchange and bioinformatic development. In addressing these challenges, we see an incredible opportunity to advance global public health.
In March of 2019, we convened an international group of experts to explore this problem, and one of the key conclusions was the need for a new organization to help develop consensus standards and best practices. The Public Health Alliance for Genomic Epidemiology (PHA4GE) was established because these key bioinformatics stakeholders recognized the need to combine global efforts. We have already taken some critical formative steps by establishing technical resources, an organizational charter, a code of conduct and a handful of focused working groups.
The Public Health Alliance for Genomic Epidemiology (PHA4GE) is a global coalition that is actively working to establish consensus standards; document and share best practices; improve the availability of critical bioinformatic tools and resources; and advocate for greater openness, interoperability, accessibility and reproducibility in public health microbial bioinformatics.
The Public Health Alliance for Genomic Epidemiology has the following objectives:
- Promote the establishment of an open source, standards-driven bioinformatics platform/ecosystem for public health.
- Establish a forum and process to develop consensus on rough architecture and infrastructure recommendations, overall microbial bioinformatic technical requirements, resource needs and gaps, best practices, potential execution models, funding priorities and governance.
- Determine how this effort would interact with the existing public health bioinformatics landscape, and how to best position and advocate for openness, reproducibility and consistent standards.
- Develop processes for community-driven and managed standards and best practices for bioinformatics software development, infrastructure, APIs and data standards.
- Establish technical guidance and procedures for (meta)data management, data integration, data sovereignty and governance, including improved capture and integration of unstructured epidemiologic and clinical data, improved data visualization and reporting, and improved use of structured language and ontologies.
- Provide public health community consensus input and feedback to international sequence repositories on the submission, query and retrieval of public health microbial sequence data.
PHA4GE is working to:
- reduce the barrier to entry for routine sequencing;
- promote standardization, portability and reproducibility of assays and workflows;
- advance the use of open data and open source in public health;
- improve surveillance and outbreak response capabilities;
- promote innovation, collaboration and development across the public and private sectors;
- foster the development and resiliency of the global public health bioinformatic workforce;
- enable global public health to adapt more rapidly to changing priorities and emerging threats;
- and empower more labs to analyze and govern their own data, regardless of resource status.
Stay tuned.