
Feature request: RSeQC split_bam.py - BED-file approach to rRNA quantification #111

Description

@rhassaine

Thanks for RustQC - the single-pass design is awesome!

Potential gap: biotype % rRNA substantially undercounts rRNA on human GRCh38 because it relies on the GTF annotation, whereas RSeQC's split_bam.py counts reads overlapping a BED file of rRNA regions.

What's happening

On GRCh38 / GENCODE v38 STAR-aligned BAMs, RustQC's biotype % rRNA comes out ≈0 across all WTS samples, while RSeQC split_bam (reads overlapping an rRNA-region BED) reports several percent on the same BAMs. The featureCounts summary shows why:

  • Multi-mappers are dropped (featureCounts default). rDNA is a high-copy repeat, so most rRNA reads multi-map → Unassigned_MultiMapping.
  • The 45S rDNA (18S/5.8S/28S) isn't annotated in GENCODE GRCh38: it sits in the unassembled acrocentric arms, and only a few dozen rRNA genes are annotated (mostly 5S + pseudogenes), so uniquely-mapped rRNA reads → Unassigned_NoFeatures.

So biotype counting structurally can't see rRNA on a stock GRCh38/GENCODE setup, whereas interval overlap catches it. The practical impact is that % rRNA reads near zero when the true residual rRNA is ~10%, so the metric can't be used to judge depletion.
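To make the two failure modes concrete, here is a toy model in Python. It is only a sketch of the logic described above, not RustQC's or featureCounts' actual code: the `Read` type, the `chrUn_rDNA` contig name, and all coordinates are made up for illustration, and each read is modelled as a single record carrying its NH (number of alignments) value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Read:
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive
    nh: int     # number of reported alignments (SAM NH tag)


def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end


def hits_any(read, intervals):
    """True if the read overlaps any (chrom, start, end) interval."""
    return any(
        chrom == read.chrom and overlaps(read.start, read.end, start, end)
        for chrom, start, end in intervals
    )


def biotype_rrna_count(reads, rrna_genes, count_multimappers=False):
    """Annotation-based count: only reads assigned to an annotated rRNA gene."""
    n = 0
    for read in reads:
        if read.nh > 1 and not count_multimappers:
            continue  # would land in Unassigned_MultiMapping
        if hits_any(read, rrna_genes):
            n += 1
        # otherwise the read would land in Unassigned_NoFeatures
    return n


def interval_rrna_count(reads, rrna_bed):
    """Interval-based count: any read overlapping an rRNA region."""
    return sum(1 for read in reads if hits_any(read, rrna_bed))


# Hypothetical annotation: a single 5S gene is all the GTF knows about.
RRNA_GENES = [("chr1", 1000, 1120)]
# Hypothetical BED: the 5S gene plus an unannotated 45S-like region.
RRNA_BED = RRNA_GENES + [("chrUn_rDNA", 0, 13000)]

READS = (
    [Read("chrUn_rDNA", 100, 200, nh=5), Read("chrUn_rDNA", 5000, 5100, nh=5)]
    + [Read("chrUn_rDNA", 9000, 9100, nh=1)]   # unique, but no annotated feature
    + [Read("chr1", 1010, 1110, nh=3)]         # in the 5S gene, but multi-mapped
    + [Read("chr2", i * 1000, i * 1000 + 100, nh=1) for i in range(6)]  # mRNA
)

biotype_default = biotype_rrna_count(READS, RRNA_GENES)
biotype_with_multi = biotype_rrna_count(READS, RRNA_GENES, count_multimappers=True)
interval_based = interval_rrna_count(READS, RRNA_BED)

print(f"biotype (default):       {100 * biotype_default / len(READS):.0f}%")
print(f"biotype (+multimappers): {100 * biotype_with_multi / len(READS):.0f}%")
print(f"interval overlap:        {100 * interval_based / len(READS):.0f}%")
```

In this toy set, 4 of 10 reads are rRNA. The default biotype pass sees none of them, counting multi-mappers recovers only the one read inside the annotated 5S gene, and the interval pass sees all four.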

What would be really awesome

  1. An interval/BED-based rRNA mode (like RSeQC split_bam) — most robust for rRNA quantification.
  2. Options to count multi-mapping/multi-overlapping reads in the biotype pass (the featureCounts -M/-O equivalents, which RustQC doesn't currently expose).
  3. At minimum, a docs note that biotype % rRNA under-reports where the rDNA isn't annotated.
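For option 1, a rough sketch of what the interval lookup could look like, written in Python for brevity. This is an assumption about one workable design (merge the BED intervals per contig, then binary-search each alignment), not a description of how RSeQC or RustQC implement anything. The function names, the `chrUn_rDNA` contig, and the coordinates are hypothetical, and alignments are plain `(chrom, start, end)` tuples rather than BAM records.

```python
from bisect import bisect_right
from collections import defaultdict


def load_bed(lines):
    """Parse BED lines into {chrom: (starts, ends)} of merged, sorted intervals."""
    by_chrom = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        by_chrom[chrom].append((int(start), int(end)))

    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        merged = [list(intervals[0])]
        for start, end in intervals[1:]:
            if start <= merged[-1][1]:  # overlapping or touching: extend
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    return index


def overlaps_rrna(index, chrom, start, end):
    """True if the half-open alignment [start, end) overlaps any rRNA interval."""
    if chrom not in index:
        return False
    starts, ends = index[chrom]
    # Last merged interval that begins before the alignment ends.
    i = bisect_right(starts, end - 1) - 1
    # Merged intervals are disjoint, so this one has the largest end
    # among all candidates; checking it alone is sufficient.
    return i >= 0 and ends[i] > start


# Hypothetical rRNA BED (0-based, half-open coordinates).
BED = [
    "chr1\t1000\t1120\t5S",
    "chrUn_rDNA\t0\t13000\t45S",
    "chrUn_rDNA\t12000\t14000\t45S_tail",
]
index = load_bed(BED)

alignments = [
    ("chrUn_rDNA", 500, 600),  # inside the 45S-like region
    ("chr1", 1100, 1200),      # overlaps the 5S gene by 20 bases
    ("chr1", 1120, 1220),      # starts exactly where the 5S gene ends: no overlap
    ("chr2", 0, 100),          # contig absent from the BED
]
n_rrna = sum(overlaps_rrna(index, *aln) for aln in alignments)
pct_rrna = 100 * n_rrna / len(alignments)
print(f"{n_rrna}/{len(alignments)} alignments overlap rRNA ({pct_rrna:.1f}%)")
```

Because the lookup needs only the contig and coordinates of each alignment, it seems like something that could sit inside the existing single pass over the BAM without a second read of the file.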

RSeQC covers it but doesn't have the speed of the single-pass call in Rust. Happy to share details, test a fix or help in any way.

Environment: RustQC 0.2.1 · GRCh38 · GENCODE v38 · STAR, paired-end.

Labels: enhancement
