High Throughput Sequencing (HTS): Principle, Steps, Uses
High Throughput Sequencing (HTS), universally known as Next-Generation Sequencing (NGS) or Massively Parallel Sequencing, represents a revolutionary leap from traditional Sanger sequencing. Where the Sanger method was restricted by its low throughput and high cost for large-scale projects, HTS technologies enable the rapid and simultaneous sequencing of millions of DNA or RNA fragments. This capability has fundamentally transformed genomics, allowing researchers to study the entire genetic blueprint of an organism—the genome—or its entire set of transcribed RNA—the transcriptome—on an unprecedented scale.
The term ‘high-throughput’ directly refers to the technology’s ability to read a vast amount of DNA molecules in parallel, generating massive amounts of data in a single, highly efficient experiment. This dramatic increase in speed and efficiency has made complex analyses, such as whole genome sequencing (WGS) and whole exome sequencing (WES), accessible, moving them from niche research to mainstream clinical and agricultural applications.
The Fundamental Principle of Massive Parallelism
The core principle that distinguishes HTS from previous methods is massive parallelism. Instead of sequencing one long DNA fragment at a time, HTS breaks the entire genome into millions of short fragments, which are then sequenced simultaneously. The process is predicated on several key technological innovations:
First, all major HTS platforms rely on a step called **clonal amplification**. The fragmented DNA or RNA sequences must be amplified to create localized clusters of identical sequences. This step is necessary to ensure that the sequencing signal (whether it is light, pH change, or electrical current) is strong enough to be detected by the machine’s sensors. Methods like bridge amplification (used in Illumina) or emulsion-PCR (used in Ion Torrent) achieve this clonal clustering.
Second, the sequencing itself is a cyclical process. Bases are read one nucleotide at a time through progressive rounds of chemical reactions, detection, and signal clearance. This systematic, iterative reading process allows for highly accurate base calling across all clusters simultaneously.
The Sequential Steps of High Throughput Sequencing
The HTS workflow can be systematically divided into four major stages: Sample Preparation, Library Preparation, Sequencing, and Data Analysis.
The process begins with **Sample Preparation**, which involves isolating the genetic material (DNA or RNA) from the biological sample of interest. This step is crucial for ensuring the integrity and quality of the starting material.
Next is **Library Preparation**. The isolated nucleic acids are first fragmented into small pieces. Short, synthetic DNA sequences, known as **adapters**, are then attached to both ends of these fragments. These adapters are essential; they serve as universal priming sites for the subsequent amplification step, help in aligning the sequences later, and often contain unique molecular barcodes for multiplexing (pooling and sequencing multiple samples in one run).
The third phase is the **Sequencing** itself, where the prepared library is placed onto the sequencing platform, such as a flow cell (for Illumina) or a chip with microwells (for Ion Torrent). The fragments undergo clonal amplification and are then sequenced via different methodologies. For example, in Illumina’s **Sequencing by Synthesis** using cyclic reversible termination, fluorescently labeled nucleotides are incorporated, a picture is taken (imaging, as shown in relevant diagrams), and then the fluorescent tag is cleaved to prepare for the next cycle.
The final and most data-intensive step is **Data Analysis**. The sequencing platform generates vast amounts of raw data, which are then processed through a bioinformatics pipeline. This involves initial quality control (removing low-quality reads and adapter sequences), followed by **read alignment** (mapping and aligning the short sequence reads to a known reference genome) or *de novo* assembly (constructing a genome without a reference). The final stage of analysis involves identifying genetic variants, such as Single Nucleotide Polymorphisms (SNPs), structural variations, or quantifying gene expression patterns.
Key HTS Platforms and Technologies
Different commercial platforms utilize distinct sequencing chemistries, primarily falling under the umbrella of ‘sequencing-by-synthesis’ or ‘nanopore-based’ methods.
Illumina sequencing, the dominant technology, employs a **cyclic reversible termination** chemistry. Fluorescently-labeled, reversible terminators are added, one base is incorporated across all amplified clusters, the plate is imaged (a conceptual ‘diagram’ is often used to illustrate the four-color detection), and the terminator is cleaved to allow the next base incorporation. This method is highly accurate and scalable.
ThermoFisher’s Ion Torrent platform utilizes a **semiconductor sequencing** method. Instead of light detection, it detects the release of a hydrogen ion (H⁺) when a deoxynucleotide triphosphate (dNTP) is incorporated into a growing DNA strand. This H⁺ release causes a localized pH change, which is detected by an integrated sensor at the bottom of the microwell and converted into a voltage signal. This difference allows for base discrimination without the need for optical scanning.
Newer technologies, such as **Nanopore-based sequencing** (Oxford Nanopore Technology), achieve single-molecule, real-time sequencing. This method involves threading a DNA or RNA molecule through a tiny protein nanopore embedded in a membrane. As the nucleic acid passes through, each base causes a characteristic disruption in the electric current across the membrane, which is recorded and translated into a sequence. This technology uniquely enables the sequencing of exceptionally long DNA fragments.
Diverse Applications Across Biomedical Research and Beyond
The transformative power of HTS lies in its broad application across molecular biology, medicine, and agriculture.
In **Genomics**, HTS is essential for **Whole Genome Sequencing (WGS)**, providing a high-resolution view of the entire genetic makeup of an organism. **Whole Exome Sequencing (WES)**, which focuses on the protein-coding regions (exome), is a cost-effective alternative frequently used to identify genetic variants that contribute to diseases. **Resequencing**, which involves comparing a patient’s genome to a reference genome, is critical for identifying genetic differences like SNPs, aiding in the diagnosis of genetic disorders and cancer.
In **Transcriptomics**, methods like **RNA-Seq** are used to sequence RNA molecules. This allows researchers to quantify gene expression, identify alternative splicing events, and unravel post-transcriptional modifications, providing deep insight into what genes are active under different conditions, clinical states, or developmental phases.
The field of **Epigenetics** leverages HTS through assays like **ChIP-Seq** (Chromatin Immunoprecipitation Sequencing) and **ATAC-Seq** (Assay for Transposase-Accessible Chromatin with Sequencing). These methods map the binding sites of DNA-associated proteins (like transcription factors) and profile chromatin accessibility, respectively, which are key mechanisms for regulating gene expression without altering the DNA sequence itself.
Crucially, HTS plays a major role in **Clinical and Biomedical Research**. By providing detailed insights into cellular genomics and transcriptomics, it facilitates the identification and characterization of **biomarkers**—genes or proteins used for diagnosis, predicting patient response to treatment, or estimating prognosis. The technology is increasingly being adopted for *in-vitro* diagnostics for diseases like cystic fibrosis and is instrumental in the personalized medicine revolution, allowing for targeted therapies based on an individual’s unique genetic profile.
Conclusion
High Throughput Sequencing has become the cornerstone of modern biological research. Its advantages in scalability, speed, and cost-effectiveness over traditional methods have led to an exponential increase in genomic data. By offering a comprehensive and deep analysis of the genome, transcriptome, and epigenome, HTS continues to drive significant breakthroughs in understanding complex biological processes, elucidating disease pathways, and accelerating the development of new diagnostic tools and therapeutic interventions.