Minimap2: Fast DNA & RNA Sequence Aligner

Minimap2 is a sequence alignment tool for mapping DNA or RNA sequences against large reference databases, such as genomes or transcriptomes. It is known for its speed, accuracy, and flexibility, and it handles both short reads (for example, from Illumina) and long reads (for example, from Oxford Nanopore or PacBio). Minimap2 can also perform spliced alignment, which makes it a good choice for RNA-Seq data analysis.

Key Features

🚀

Efficient Alignment

Minimap2 is optimized to align millions of reads quickly with modest system resources. It is faster than many traditional aligners, particularly on long reads.

🔬

Support for Long and Short Reads

It handles long reads from technologies like Oxford Nanopore and PacBio as well as short reads like those from Illumina.

🧠

Spliced Alignment

Minimap2 can align RNA sequences that span introns by detecting splice junctions. This is essential for analyzing RNA-Seq data.

📄

Compatible with SAM/BAM Output

Minimap2 writes alignments in the widely used SAM format when run with the -a option (its default output is PAF). SAM output integrates directly with tools like SAMtools, which can convert it to BAM.

Installation

Minimap2 can be installed using three common methods: building from source, using precompiled binaries, or via package managers. Below are instructions for building from source and for installing through a package manager.

Install from Source

This method is ideal for users who prefer to build the latest version manually or need full control over the build process.

Clone the repository:

  • git clone https://github.com/lh3/minimap2.git
  • cd minimap2

Build the executable:

  • make

Verify the installation:

  • ./minimap2 --version

Dependencies:

  • GCC or another C compiler
  • make utility
  • zlib development files
  • Unix-based system (Linux or macOS recommended)
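
Before building, it can help to confirm that the basic tools are present. The following is a minimal sketch rather than part of Minimap2 itself: the helper names have_cmd and check_prereqs are illustrative, and cc is assumed to point at the system C compiler.

```shell
# Return success if the named command is available on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Print one status line per tool used in the build steps.
# The tool list (cc, make, git) is an assumption; adjust it to your setup.
check_prereqs() {
    for tool in cc make git; do
        if have_cmd "$tool"; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
        fi
    done
}

check_prereqs
```

Any tool reported as missing would need to be installed through the system package manager before running make.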

Install Using Package Managers

This is the easiest and most user-friendly method, especially for those already using package managers like Conda or Homebrew.

Conda (via Bioconda):

  • conda install -c bioconda minimap2
  • Compatible with Linux and macOS
  • Automatically handles all dependencies

Homebrew (macOS/Linux):

  • brew install minimap2
  • Works on macOS (Intel & Apple Silicon) and Linux

Linux Package Managers:

Debian and Ubuntu provide a minimap2 package in their repositories (sudo apt install minimap2), although it may lag behind the latest release. If your distribution does not package Minimap2, or you need a newer version, you can:

  • Use Conda or Homebrew
  • Or download and run the precompiled binary

User Type      Recommended Method
Beginners      Conda / Homebrew
Developers     Build from Source
Quick Setup    Precompiled Binary
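
The choice between these methods can be sketched as a small shell helper. This is only an illustration: pick_installer is a hypothetical function name, and it prints a suggested command from this guide instead of running it.

```shell
# Suggest an install command based on which package manager is on PATH.
# pick_installer is a hypothetical helper; it prints a command and installs nothing.
pick_installer() {
    if command -v conda >/dev/null 2>&1; then
        echo "conda install -c bioconda minimap2"
    elif command -v brew >/dev/null 2>&1; then
        echo "brew install minimap2"
    else
        # Fall back to the build-from-source steps.
        echo "git clone https://github.com/lh3/minimap2.git && cd minimap2 && make"
    fi
}

pick_installer
```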

Usage and Basic Commands

Minimap2 works through the command-line interface (CLI). You give it a reference genome and the sequencing reads you want to align, along with some optional flags to control the behavior.

Command-Line Syntax

minimap2 [options] <ref.fa> <reads.fq>

Explanation:

  • minimap2: This is the executable file.
  • [options]: Various flags or settings like read type, number of threads, etc.
  • <ref.fa>: Reference genome in FASTA format (e.g., human genome).
  • <reads.fq>: Your sequencing data (usually in FASTQ format).
  • Important: Always make sure both files are in the correct format (FASTA or FASTQ).
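
A quick format check before aligning can be sketched as follows. It relies on the fact that FASTA records start with ">" and FASTQ records start with "@". The function name guess_format is illustrative, and the check only looks at uncompressed files.

```shell
# Guess the sequence format from the first character of a file:
# FASTA records start with '>', FASTQ records start with '@'.
guess_format() {
    first=$(head -c 1 "$1")
    case "$first" in
        ">") echo "fasta" ;;
        "@") echo "fastq" ;;
        *)   echo "unknown" ;;
    esac
}

# Demonstration on two tiny temporary files.
tmpdir=$(mktemp -d)
printf '>chr1\nACGT\n' > "$tmpdir/ref.fa"
printf '@read1\nACGT\n+\nIIII\n' > "$tmpdir/reads.fq"
guess_format "$tmpdir/ref.fa"     # prints: fasta
guess_format "$tmpdir/reads.fq"   # prints: fastq
rm -rf "$tmpdir"
```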

Minimap2 Common Modes (Presets)

Mode            Usage Description
map-ont         For Oxford Nanopore long reads.
map-pb          For PacBio CLR long reads (recent versions provide map-hifi for HiFi reads).
splice          For spliced alignment, typically of RNA-Seq reads to a genome (includes intron handling).
asm5 / asm10    For aligning assembled genomes to other genomes of low divergence.
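
In a script, the preset can be chosen from the kind of data at hand. The sketch below maps a few read-type labels to the presets used in this guide. The labels (nanopore, pacbio-clr, and so on) and the function name preset_for are made up for illustration; only the preset names come from Minimap2.

```shell
# Map a read-type label to a Minimap2 preset name.
# The labels are hypothetical; the presets are the ones described in this guide.
preset_for() {
    case "$1" in
        nanopore)   echo "map-ont" ;;
        pacbio-clr) echo "map-pb" ;;
        short)      echo "sr" ;;
        rna)        echo "splice" ;;
        assembly)   echo "asm5" ;;
        *)          echo "unknown read type: $1" >&2; return 1 ;;
    esac
}

# Print the command that would be run for Nanopore reads.
echo "minimap2 -ax $(preset_for nanopore) ref.fa reads.fq"
# prints: minimap2 -ax map-ont ref.fa reads.fq
```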

Example Commands

Align Long Reads

minimap2 -ax map-ont ref.fa reads.fq > aln.sam

  • -a: Output in SAM format (recommended for most use cases; without it, Minimap2 writes PAF).
  • -x map-ont: Preset optimized for Oxford Nanopore reads.
  • ref.fa: Your reference genome.
  • reads.fq: Your Nanopore read file (for PacBio reads, switch to a PacBio preset such as map-pb).
  • aln.sam: Output file.
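
Downstream tools usually expect a sorted, indexed BAM rather than raw SAM. One way to get there is to pipe Minimap2 straight into samtools, as in the sketch below. It assumes minimap2 and samtools are both on PATH; align_to_bam and the file names are placeholders, and the default of 4 threads is arbitrary.

```shell
# Align Nanopore reads and write a sorted, indexed BAM.
# Arguments: reference, reads, output BAM, thread count (default 4).
align_to_bam() {
    ref=$1; reads=$2; out=$3; threads=${4:-4}
    # SAM is streamed to samtools sort, so no intermediate file is written.
    minimap2 -ax map-ont -t "$threads" "$ref" "$reads" \
        | samtools sort -o "$out" -
    samtools index "$out"
}

# Run only when both tools and the example inputs are present.
if command -v minimap2 >/dev/null 2>&1 && command -v samtools >/dev/null 2>&1 \
   && [ -f ref.fa ] && [ -f reads.fq ]; then
    align_to_bam ref.fa reads.fq aln.sorted.bam 8
else
    echo "skipping: minimap2, samtools, ref.fa, or reads.fq not found"
fi
```

Streaming into samtools sort avoids storing the uncompressed SAM, which can be very large for long-read datasets.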

Align Short Reads

minimap2 -ax sr ref.fa reads.fq > aln.sam

  • -x sr: Preset for short reads.
  • Useful for fast mapping, though dedicated short-read aligners such as BWA or Bowtie2 may give better accuracy.

Align Transcriptome Data (RNA-Seq)

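minimap2 -ax splice -uf -k14 ref.fa rna_reads.fq > aln.sam

  • -x splice: Preset for spliced alignment of RNA-Seq reads.
  • -uf: A single option (-u with the value f) that tells Minimap2 to look for canonical GT-AG splice sites on the transcript strand only. It suits stranded data such as Nanopore direct RNA; omit it for unstranded cDNA reads.
  • -k14: Sets the minimizer k-mer size to 14, which increases sensitivity for noisy reads such as Nanopore direct RNA.
  • In the SAM output, introns appear as N operations in the CIGAR string.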
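
Since the SAM format records a skipped reference region (such as an intron) as an N operation in the CIGAR field, spliced alignments can be counted with a short awk filter. The sketch below runs on a tiny hand-written SAM file; count_spliced is an illustrative name and the records are invented for the demonstration.

```shell
# Count SAM records whose CIGAR (column 6) contains an N operation.
count_spliced() {
    awk -F '\t' '!/^@/ && $6 ~ /[0-9]+N/ { n++ } END { print n + 0 }' "$1"
}

# Demonstration on a made-up SAM file with three alignments.
demo=$(mktemp)
{
    printf '@SQ\tSN:chr1\tLN:1000\n'
    printf 'r1\t0\tchr1\t100\t60\t50M200N50M\t*\t0\t0\t*\t*\n'
    printf 'r2\t0\tchr1\t300\t60\t100M\t*\t0\t0\t*\t*\n'
    printf 'r3\t16\tchr1\t500\t60\t30M150N40M90N30M\t*\t0\t0\t*\t*\n'
} > "$demo"
count_spliced "$demo"   # prints: 2
rm -f "$demo"
```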

Use Cases of Minimap2

Minimap2 is a versatile aligner, widely used in genomics and transcriptomics. Below are its major real-world applications:

Genome Assembly Polishing

What it means:

After assembling a genome (e.g., using long reads from Oxford Nanopore or PacBio), the result may contain errors like insertions, deletions, or mismatches.

How Minimap2 helps:

  • It maps raw reads back to the draft genome assembly.
  • Tools like Racon or Pilon use these alignments to identify and correct errors.
  • Improves the base-level accuracy of assemblies.

Example:

minimap2 -ax map-ont assembly.fasta reads.fq > aln.sam
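
Before handing alignments to a polisher, it is worth checking how many reads actually mapped to the draft. The sketch below assumes the alignments are in SAM format (produced with -a) and reads the FLAG column: bit 4 marks an unmapped read, while bits 256 and 2048 mark secondary and supplementary records, which are skipped so each read is counted once. The name mapping_summary and the demo records are illustrative.

```shell
# Summarize how many reads in a SAM file are mapped.
mapping_summary() {
    awk -F '\t' '
        /^@/ { next }
        # Skip secondary (256) and supplementary (2048) records.
        int($2 / 256) % 2 == 1 || int($2 / 2048) % 2 == 1 { next }
        { total++ }
        # Bit 4 unset means the read is mapped.
        int($2 / 4) % 2 == 0 { mapped++ }
        END { printf "%d of %d reads mapped\n", mapped, total }
    ' "$1"
}

# Demonstration on a made-up SAM file.
demo=$(mktemp)
{
    printf '@SQ\tSN:contig1\tLN:5000\n'
    printf 'r1\t0\tcontig1\t10\t60\t100M\t*\t0\t0\t*\t*\n'
    printf 'r2\t16\tcontig1\t500\t60\t100M\t*\t0\t0\t*\t*\n'
    printf 'r2\t2064\tcontig1\t900\t60\t40M60H\t*\t0\t0\t*\t*\n'
    printf 'r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n'
} > "$demo"
mapping_summary "$demo"   # prints: 2 of 3 reads mapped
rm -f "$demo"
```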

Structural Variant Calling

What it means:

Structural variants (SVs) are large changes in DNA like deletions, insertions, duplications, inversions, or translocations.

How Minimap2 helps:

  • Aligns long reads to a reference genome.
  • Since long reads can span SVs, the alignments clearly reveal structural changes.
  • Tools like Sniffles or SVIM use Minimap2 output to detect SVs.

Benefit:

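Long-read alignments from Minimap2 typically resolve SVs much better than short-read alignments from BWA or Bowtie2, because a single read can span the entire variant.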
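
To see what SV evidence looks like in practice, the sketch below scans the CIGAR strings of a SAM file and lists insertions or deletions of at least a given length. This is a simple illustration, not how Sniffles or SVIM work internally: large_indels is a hypothetical helper, the 50-base default threshold is an assumption, and the demo records are invented.

```shell
# Print read name, operation, and length for every insertion (I) or
# deletion (D) of at least MIN bases in the CIGAR column of a SAM file.
# Usage: large_indels file.sam [MIN]   (MIN defaults to 50)
large_indels() {
    awk -F '\t' -v min="${2:-50}" '
        /^@/ { next }
        {
            cigar = $6
            # Consume the CIGAR one operation at a time, e.g. "120D".
            while (match(cigar, /^[0-9]+[MIDNSHP=X]/)) {
                len = substr(cigar, 1, RLENGTH - 1) + 0
                op  = substr(cigar, RLENGTH, 1)
                if ((op == "I" || op == "D") && len >= min)
                    print $1, op, len
                cigar = substr(cigar, RLENGTH + 1)
            }
        }' "$1"
}

# Demonstration on a made-up SAM file.
demo=$(mktemp)
{
    printf 'r1\t0\tchr1\t100\t60\t500M\t*\t0\t0\t*\t*\n'
    printf 'r2\t0\tchr1\t100\t60\t200M120D300M\t*\t0\t0\t*\t*\n'
    printf 'r3\t0\tchr1\t100\t60\t100M5I100M75I50M\t*\t0\t0\t*\t*\n'
} > "$demo"
large_indels "$demo"
# prints:
# r2 D 120
# r3 I 75
rm -f "$demo"
```

Dedicated SV callers also use split (supplementary) alignments and read depth, so this filter only captures one kind of signal.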

Transcriptome Mapping

What it means:

Transcriptome mapping involves aligning RNA-seq reads to a genome to study gene expression and splicing.

How Minimap2 helps:

  • With the -x splice preset, Minimap2 supports spliced alignment, meaning it can map RNA reads that cross exon-intron boundaries.
  • Works well with both long-read RNA-Seq (e.g., ONT direct RNA) and short-read data.

Example:

minimap2 -ax splice -uf -k14 ref_genome.fa rna_reads.fq > rna.sam

Use Case                      What Minimap2 Does                                              Commonly Paired Tools
Genome Assembly Polishing     Aligns reads to draft assemblies to fix errors                  Racon, Medaka, Pilon
Structural Variant Calling    Aligns long reads so SV callers can detect large DNA changes    Sniffles, SVIM
Transcriptome Mapping         Maps spliced RNA reads to genome                                StringTie2, TALON, FLAIR
Genome Comparisons            Aligns entire genomes for evolutionary or structural study      DotPlot, SyRI, Assemblytics

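FAQs

What is Minimap2?

Minimap2 is a fast and efficient sequence alignment tool used for mapping DNA or RNA reads to a reference genome.

Who developed Minimap2?

It was developed by Heng Li, a prominent bioinformatics researcher.

Is Minimap2 open-source?

Yes, it is open-source and licensed under the MIT License.

What types of alignment does Minimap2 support?

Long-read alignment, short-read alignment, spliced RNA-Seq mapping, and genome-to-genome alignment.

How do I install Minimap2?

Clone the GitHub repository and run make in the directory, or use Conda/Homebrew.

Does Minimap2 run on Windows?

It is not officially supported on Windows but can be run via WSL (Windows Subsystem for Linux).

Can I install Minimap2 with Conda?

Yes, run: conda install -c bioconda minimap2

What does Minimap2 need to build from source?

It requires a C compiler like GCC or Clang, make, and the zlib development files; no other libraries are needed.

Does Minimap2 have a graphical interface?

No, Minimap2 is command-line based only.

What input and output formats does Minimap2 use?

Input: FASTA/FASTQ. Output: PAF by default, or SAM with the -a option; SAM can be converted to BAM by piping into samtools.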

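How do I align Oxford Nanopore reads?

Use the map-ont preset:

minimap2 -ax map-ont ref.fasta reads.fq > output.sam

How do I align PacBio reads?

Use the map-pb preset (recent versions also provide map-hifi for HiFi reads):

minimap2 -ax map-pb ref.fasta reads.fq > output.sam

Can Minimap2 align paired-end reads?

Yes, by providing both FASTQ files in the command.

How do I align RNA-Seq reads?

Use the splice preset:

minimap2 -ax splice ref.fa rna_reads.fq > aln.sam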

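Is Minimap2 fast?

Yes, especially for long-read data and large genomes.

How much memory does Minimap2 use?

It depends on the genome size and read length, but generally uses low to moderate memory.

Does Minimap2 support multithreading?

Yes, using the -t option (e.g., -t 8 for 8 threads).

Can I limit the number of secondary alignments?

Yes, with the -N option (e.g., -N 5 to retain at most 5 secondary alignments).

Does Minimap2 perform gapped alignment?

Yes, it supports both gapped and spliced alignments.

What format are alignments written in?

With the -a option, it outputs alignments in SAM format; without it, the output is PAF.

What is chaining?

Chaining is an internal process that helps identify the best regions of similarity before alignment.

Can Minimap2 be used for assembly polishing?

Yes, it’s commonly used with tools like Racon or Medaka for this purpose.

Can Minimap2 align one genome to another?

Yes, using asm5, asm10, or asm20 presets for genome-to-genome alignment.

Does Minimap2 call structural variants?

Not directly, but it can generate alignments used by SV callers like Sniffles.

Is the output compatible with samtools?

Yes, with -a it outputs SAM format that samtools can process.

Can Minimap2 be used in workflow pipelines?

Absolutely, it’s compatible with Snakemake, Nextflow, and shell scripts.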

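Can Minimap2 read gzip-compressed files?

Yes, Minimap2 reads gzip-compressed FASTA and FASTQ files (.gz) directly, so no separate decompression step is needed.

Can I visualize the alignments?

Yes, convert SAM to a sorted, indexed BAM and view it with IGV or similar tools.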
