Stephane Budel

The Diffusion Index

Long-Read Sequencing

10,395 PubMed papers. One curve. But long-read tells a different story than NGS or single-cell: the technology was adopted without the usual novelty premium. Here’s what that looks like.

Papers analyzed: 10,395
Peak score (2017): 29/1,000
Current score (2026): 2.3/1,000
Peak-to-floor decline: ~13×

How many papers, and where they land

Annual long-read sequencing papers split into three tiers — Nature / Science / Cell, the rest of the top-tier specialist journals (Tier 1+2), and everything else. The log view keeps all three visible; the widening gap is the dilution the diffusion curve below captures as a ratio.

Legend: Nature / Science / Cell · Tier 1+2 (incl. top 3) · All papers

Log scale: each line is a count, so all three tiers stay visible despite spanning roughly three orders of magnitude. The gap between the lines is the dilution: top-3 output barely moves while total volume explodes. 2006–2025 (2026 partial year omitted).

Legend: Top-tier share · Journal H-index · Innovators · Early Adopters · Early Majority
○ faded dots = N < 50 papers (noisy) · Y-axis capped at 50 — 2009 outlier shown with actual value (N=3)

Peak: 29 per 1,000 in 2017 · current (2026): 2.3 per 1,000, approaching floor

A muted novelty premium

Long-read sequencing peaked at just ~29 per 1,000 papers in 2017, when the PacBio Sequel (launched in late 2015) was rolling out and Oxford Nanopore was gaining real traction. Compare that to NGS (118/1,000), single-cell (155/1,000), or spatial (146/1,000). The gap is not a methodological artifact; it reflects how the technology entered the ecosystem. Long-read was always an extension of short-read NGS, not a paradigm replacement. Tellingly, that modest peak did not even arrive during the Innovators phase: by 2017 long-read had already crossed into Early Majority. It is the one platform here whose novelty premium never had an Innovators-era spike to begin with.

Infrastructure, not discovery

Most long-read papers are assembly-focused: completing reference genomes, resolving structural variants, phasing haplotypes. These are important, but they are not the kind of “we discovered something surprising about biology” papers that earn a place in Nature or Cell. The telomere-to-telomere human genome (2022) was a landmark, but even it generated only a modest score uptick. The tool that makes references more complete gets less credit than the tool that enables new experiments.

PacBio vs. ONT: a duopoly

Platform detection in the underlying papers shows PacBio (2,928 mentions) slightly ahead of Oxford Nanopore (2,442). That reflects two genuinely different use cases: PacBio HiFi dominates in high-accuracy assembly and germline variant calling; ONT ultra-long reads dominate in structural variant and centromere resolution. Clinical adoption has historically favored PacBio for its lower error rate — but ONT’s real-time sequencing and lower cost are driving it into pathogen surveillance and bedside clinical testing.
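The page does not say how platform mentions were detected, so the following is only a minimal sketch of one plausible approach: keyword matching over titles or abstracts. The pattern lists and the function names `detect_platforms` and `count_mentions` are hypothetical, chosen for illustration. The final lines simply restate the two mention counts quoted above as a share.

```python
import re

# Hypothetical keyword patterns. The actual detection terms are not published
# on this page, so treat these as placeholders.
PLATFORM_PATTERNS = {
    "PacBio": re.compile(r"\b(pacbio|pacific biosciences|smrt|hifi)\b", re.IGNORECASE),
    "ONT": re.compile(r"\b(oxford nanopore|minion|promethion|gridion)\b", re.IGNORECASE),
}

def detect_platforms(text):
    """Return the set of platforms whose keywords appear in the text."""
    return {name for name, pattern in PLATFORM_PATTERNS.items() if pattern.search(text)}

def count_mentions(abstracts):
    """Count papers mentioning each platform (a paper can count for both)."""
    counts = {name: 0 for name in PLATFORM_PATTERNS}
    for text in abstracts:
        for name in detect_platforms(text):
            counts[name] += 1
    return counts

# Mention counts quoted in the text above.
PACBIO_MENTIONS = 2928
ONT_MENTIONS = 2442
pacbio_share = PACBIO_MENTIONS / (PACBIO_MENTIONS + ONT_MENTIONS)  # roughly 0.545
```

A paper that names both platforms is counted once for each, which is why mention totals need not sum to the number of papers analyzed.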

Where the floor is

By 2022 the score had already collapsed to ~2–3 per 1,000, close to its floor though still above where NGS bottomed out (0.8/1,000). Long-read is unusual: it reached the floor level while publication volume was still climbing steeply (from ~700 papers in 2020 to 2,100 in 2025). This is the signature of a mature tool that enables science without earning credit for it, the same pattern you see in bioinformatics or electron microscopy. Expect the score to stabilize at 2–4/1,000 as long-read becomes a permanent feature of the genomics toolkit.
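To make the arithmetic behind these figures concrete, here is a short sketch that uses only numbers quoted on this page. The helper `implied_top_tier` is a hypothetical name; it inverts the score definition to show how few top-tier papers a floor-level score implies at current volume. Applying the 2–4 per 1,000 band to the 2025 paper count is an illustrative assumption, not a reported result.

```python
# Figures quoted in the text
PEAK_SCORE = 29.0       # per 1,000 papers, 2017
CURRENT_SCORE = 2.3     # per 1,000 papers, 2026
PAPERS_2020 = 700       # approximate annual count
PAPERS_2025 = 2100      # annual count

# Peak-to-floor decline: about 12.6, reported on the page as ~13x
decline = PEAK_SCORE / CURRENT_SCORE

# Publication volume tripled over the same stretch
volume_growth = PAPERS_2025 / PAPERS_2020

def implied_top_tier(score_per_1000, total_papers):
    """Invert the score: how many top-tier papers a given score implies."""
    return score_per_1000 / 1000 * total_papers

# Forecast band of 2-4 per 1,000 applied to 2025 volume (illustrative only)
low = implied_top_tier(2, PAPERS_2025)   # about 4 papers
high = implied_top_tier(4, PAPERS_2025)  # about 8 papers
```

At this scale a handful of top-tier papers per year moves the score noticeably, so small year-to-year wobbles inside the band should not be over-read.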

The diffusion index family

Each technology follows the same scoring logic but has a different peak height and timeline. Long-read’s flat curve is the outlier, and that tells you something about market positioning.

NGS

Peak: 118/1,000 (2008)

Floor: 0.8/1,000

● Complete

scRNA-seq

Peak: 155/1,000 (2016)

Floor: ~4/1,000

◕ Floor in sight

Spatial Tx

Peak: 146/1,000 (2020)

Floor: /1,000

◑ Mid-descent

Long-read

Peak: 29/1,000 (2017)

Floor: ~2/1,000

◕ Effectively floored

Where the papers come from

Share of long-read sequencing papers by first-author affiliation, 2013–2026. Parsed from 10,257 affiliations.

Legend: USA · China · Germany · UK (Rest of World not shown)

Long-read is the most globally distributed sequencing platform in the index: rest-of-world institutions are the plurality every year (~45%), reflecting heavy output from Japan, Australia, and the rest of Europe alongside the big four. The US led early, in keeping with PacBio’s California origins; China overtook it in 2020, and the two now run neck-and-neck at roughly 20–25% each. No single country dominates.

Methodology: Papers fetched from PubMed matching “long-read sequencing” OR “nanopore sequencing” OR “SMRT sequencing” OR “long-read genome” OR “long-read assembly” OR “single-molecule real-time sequencing”. Diffusion score = top-tier papers ÷ total papers × 1,000. Top 3 = Nature, Science, Cell. Journal tiers assigned locally using a curated list. Years with fewer than 50 papers shown as faded dots (statistically noisy). Data as of June 2026.
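The scoring formula above can be sketched in a few lines. This is a minimal illustration, not the pipeline behind the charts: the function name `diffusion_scores` and the input shape (pairs of year and journal name) are assumptions, and the `TOP_TIER` set holds only the three journals named in the methodology because the curated tier list is not reproduced on this page.

```python
from collections import Counter

# Placeholder membership: only the top 3 named in the methodology.
# The curated tier list used for the charts is not published here.
TOP_TIER = {"nature", "science", "cell"}
MIN_PAPERS = 50  # years below this threshold are drawn as faded dots

def diffusion_scores(papers):
    """Map each year to (score per 1,000, paper count, is_noisy).

    papers: iterable of (year, journal_name) pairs.
    """
    totals = Counter()
    top = Counter()
    for year, journal in papers:
        totals[year] += 1
        if journal.strip().lower() in TOP_TIER:
            top[year] += 1
    return {
        year: (1000 * top[year] / n, n, n < MIN_PAPERS)
        for year, n in sorted(totals.items())
    }

# Toy input (made-up records, not the real dataset)
sample = [(2017, "Nature")] + [(2017, "Bioinformatics")] * 99
scores = diffusion_scores(sample)
```

The noisy flag matters more than it looks: with only three papers in a year, a single top-tier paper yields a score of 333 per 1,000, which is why low-volume years are shown faded rather than trusted.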