The Earth BioGenome Project (EBP): Sequencing All Eukaryotic Life

The Earth BioGenome Project (EBP) is a global collaboration with a singular, ambitious goal: to sequence, catalog, and characterize the genomes of all known eukaryotic species on Earth. Eukaryotes, which comprise animals, plants, fungi, and protists, are the organisms whose cells have a nucleus, and so include all complex life. Inspired by the success of the Human Genome Project, the EBP was officially launched in 2018 and was originally planned as a decade-long endeavor. It is fundamentally an effort to create a vast, open-access digital library of life’s complete genetic blueprints, establishing a new foundational infrastructure for biology in the 21st century.

Scope and Ambitious Goals of the EBP

The total number of known eukaryotic species is estimated at approximately 1.5 to 1.67 million. Prior to the EBP, fewer than 0.2 percent of these species had their genomes sequenced, and even fewer were sequenced to a high, “reference-quality” standard. The EBP aims to close this knowledge gap by providing high-quality, chromosome-level reference assemblies, in which the sequence has been carefully checked and assigned to chromosomes, for every one of these species. The project is projected to cost about $4.7 billion over its full duration, making it a true “biological moonshot” in terms of scale and investment.

The project is structured into three main phases. Phase I, currently in its final stages, has focused on creating high-quality reference genomes for at least one representative species from each of approximately 9,400 eukaryotic taxonomic families, alongside another 5,000 species of particular scientific or conservation interest. Phase II is planned to target all remaining genera, sequencing an additional 150,000 species. This will require a tenfold acceleration in sequencing pace, to a rate of about 3,000 new genomes per month. Phase III will complete the mission by sequencing all remaining known species, so that the three phases together cover the entire eukaryotic tree of life.
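The Phase II figures can be checked with some quick arithmetic. The sketch below is only a back-of-the-envelope estimate: it assumes a constant monthly rate and assumes the "tenfold" acceleration is measured against the current pace, neither of which the project itself guarantees. The function and variable names are illustrative.

```python
# Back-of-the-envelope pacing for Phase II, using only figures quoted in the text.
PHASE_II_SPECIES = 150_000      # additional species targeted in Phase II
TARGET_RATE_PER_MONTH = 3_000   # pace after the planned acceleration
ACCELERATION_FACTOR = 10        # "tenfold acceleration"


def months_needed(species: int, genomes_per_month: float) -> float:
    """Months required to sequence `species` genomes at a constant monthly rate."""
    if genomes_per_month <= 0:
        raise ValueError("genomes_per_month must be positive")
    return species / genomes_per_month


# Assumption: "tenfold" is measured against the current pace.
implied_current_rate = TARGET_RATE_PER_MONTH / ACCELERATION_FACTOR

months_at_target = months_needed(PHASE_II_SPECIES, TARGET_RATE_PER_MONTH)
months_at_current = months_needed(PHASE_II_SPECIES, implied_current_rate)

print(f"Implied current pace: {implied_current_rate:.0f} genomes/month")
print(f"Phase II at target pace: {months_at_target:.0f} months "
      f"({months_at_target / 12:.1f} years)")
print(f"Phase II at current pace: {months_at_current:.0f} months "
      f"({months_at_current / 12:.1f} years)")
```

Under these assumptions, Phase II would take 50 months (a little over four years) at the target pace, compared with 500 months at the implied current pace of about 300 genomes per month.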

The Immense Significance and Anticipated Impact

The completion of the Earth BioGenome Project is anticipated to revolutionize our understanding of biology and address some of the most critical global challenges. First and foremost, the genetic data will provide unprecedented insight into the origin, evolution, and maintenance of biodiversity, helping to uncover the “rules of life”: the fundamental principles governing how biological complexity arose and how genotype relates to phenotype. It will allow scientists to map the family tree of life with far greater precision, illuminating evolutionary relationships and helping to identify the estimated 10 to 15 million species that are believed to exist but have not yet been described.

The second major area of impact lies in conservation. With Earth facing rapid biodiversity loss due to climate change and human activity, a digital library of life’s DNA sequences is viewed as a critical tool to generate effective conservation strategies. These genomes will support biomonitoring, help assess species’ ability to adapt to changing environmental conditions, and inform efforts to protect and restore global biota. Furthermore, the knowledge gained is expected to yield immense practical benefits for human welfare, including the discovery of new medicines, the identification of genes for improving agriculture and crop resilience, the development of new biomaterials and energy sources, and enhanced control of emerging infectious diseases, which is central to pandemic preparedness.

Methodology: From Sample to Reference Genome

The EBP’s methodology is a complex multi-step process that requires coordination across continents and disciplines. It begins with specimen collection. Researchers must find and collect a high-quality sample from the target species, often working with conservationists, biologists, and Indigenous Peoples and Local Communities (IPLCs) who possess essential knowledge of local flora and fauna. Collected samples can range from a leaf of a common tree to a rare deep-sea organism, and because genetic material is delicate, they must be carefully preserved in biobanks to prevent DNA degradation. Collecting ethically and obtaining the necessary permits are paramount at this initial step.

Once a sample is secured and logged with detailed metadata, the technical phase begins. DNA is carefully extracted from the specimen and broken into fragments. Advanced sequencing technologies are then used to read long stretches of this DNA, generating raw sequence data in the form of millions of individual reads. A major computational challenge follows: assembling these reads into a complete, contiguous, and accurate chromosome-scale genome sequence. This process, which can take approximately 100 processor-weeks for a mammalian-sized genome, requires powerful computers and cutting-edge assembly algorithms.
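To give a feel for what assembly involves, here is a toy greedy overlap assembler. It is a teaching sketch, not the method used by the EBP or by any production assembler: real tools typically build overlap or de Bruijn graphs and must cope with sequencing errors, repeats, and reads from both DNA strands, all of which this example ignores. The function names and the `min_len` parameter are illustrative.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` characters exists.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the largest overlap.

    Stops when no remaining pair overlaps by at least `min_len` characters
    and returns the resulting contigs.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs)
                   if k not in (best_i, best_j)] + [merged]


# Three error-free, same-strand reads taken from a made-up 15-base sequence.
reads = ["ATGGCGTA", "GCGTACGT", "TACGTTAGC"]
print(greedy_assemble(reads))  # ['ATGGCGTACGTTAGC']
```

Even this toy version compares every pair of sequences on every pass, which hints at why assembling millions of reads from a large genome is so computationally demanding.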

The final steps involve annotation, storage, and sharing. Annotation is the process of identifying the genes and other functional elements within the sequence and determining what they do. All resulting data, from the raw sequences to the final annotated genomes, must be organized, safely stored, and shared openly with researchers worldwide through public databases such as Ensembl. This open-access commitment ensures the data can be used globally for research and application, adhering to the FAIR principles (Findable, Accessible, Interoperable, and Reusable) and maximizing the impact of this shared resource.
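One way to picture what FAIR sharing asks of each genome is as a metadata record that must be complete before release. The sketch below is purely hypothetical: the field names, identifier, and URL are placeholders invented for illustration and do not reflect the schema of the EBP, Ensembl, or any other real database.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class GenomeRecord:
    """Hypothetical metadata record; field names are illustrative only."""
    species: str       # scientific name of the sequenced organism
    assembly_id: str   # persistent identifier, so the record is findable
    data_url: str      # where the data can be retrieved (accessible)
    file_format: str   # a community-standard format (interoperable)
    license: str       # terms of reuse (reusable)
    provenance: str    # how and where the sample was obtained (reusable)


def missing_fields(record: GenomeRecord) -> list[str]:
    """Return the names of fields that are empty or whitespace-only."""
    return [name for name, value in asdict(record).items() if not value.strip()]


# Placeholder values only; none of these refer to a real genome or database.
record = GenomeRecord(
    species="Examplea hypothetica",
    assembly_id="EXAMPLE-ASM-0001",
    data_url="https://example.org/genomes/EXAMPLE-ASM-0001",
    file_format="FASTA",
    license="CC0-1.0",
    provenance="Placeholder: collector, site, permit reference",
)

problems = missing_fields(record)
if problems:
    print("Incomplete record, missing:", ", ".join(problems))
else:
    print(json.dumps(asdict(record), indent=2))
```

Serializing the record to JSON is one simple choice for a machine-readable format; the point is only that each FAIR property maps to information that has to be captured alongside the sequence itself.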

Global Collaboration and Progress

The EBP functions as an umbrella initiative, providing a standardized framework and coordination for numerous large-scale national and regional genomics projects. Key affiliated projects include the Darwin Tree of Life (DToL), which aims to sequence all eukaryotic species in Britain and Ireland; the European Reference Genome Atlas (ERGA), focusing on European flora and fauna; the Insect 5,000 Genomes (i5k) project; and the Bird 10,000 Genomes (B10K) project. The EBP has a rapidly growing network of over 2,200 scientists in 88 countries, reflecting its global, decentralized nature.

As of late 2024, the EBP and its partners have generated genomes for approximately 3,000 species from over 1,000 taxonomic families, with a significant portion of those assemblies already reaching high-quality chromosome level. To meet the goal of sequencing all known species by 2035, the project has announced a significant acceleration in its efforts. This ambitious timeline highlights the need for global coordination to standardize approaches and avoid duplication, and to ensure that the technology needed to generate high-quality reference genomes is both improved and increasingly accessible to all member projects.

Ethical Considerations and Future Challenges

Despite the remarkable progress, the EBP faces significant ethical, legal, and social issues (ELSI), along with technical challenges. The most prominent ELSI concerns revolve around Access and Benefit Sharing (ABS), especially in compliance with international treaties like the Nagoya Protocol. EBP researchers must navigate complex and often differing jurisdictions, ensuring that sample collection from transnational species and from the lands of Indigenous Peoples and Local Communities is done ethically, legally, and with respect for sovereign and communal rights. Given the history of bioresource depletion on native lands, this is a particularly sensitive and critical component of the project’s operations.

On the technical front, the data storage and processing requirements are massive, demanding constant refinement of computational infrastructure and bioinformatics algorithms. The logistical hurdle of sample collection remains a persistent and expensive challenge, especially for rare, microscopic, or difficult-to-access organisms such as those deep in the ocean or in extreme environments. Overcoming these hurdles will require continued innovation in areas like remote sensing, drone technology, and sample preservation, so that the project can equitably and effectively capture the world’s biodiversity.

A New Foundation for the Future of Biology

The Earth BioGenome Project is more than a scientific undertaking; it is an urgent act of preservation and a critical investment in the future. By generating a digital library of life’s biological blueprints, the EBP provides the essential infrastructure necessary to understand and respond to the escalating global crises of climate change and biodiversity loss. This foundation will empower future generations of scientists to make discoveries that sustain human societies, advance medicine and agriculture, conserve the planet’s diverse ecosystems, and allow us to become better-informed custodians of life on Earth.
