Optimize the Nuc enum with bitwise operations and remove the lookup in trinucleotide#18
Optimize the Nuc enum with bitwise operations and remove the lookup in trinucleotide#18bmesuere wants to merge 2 commits intobenchmark/0from
Conversation
There was a problem hiding this comment.
Pull request overview
This PR optimizes the nucleotide (Nuc) representation to exploit bitwise operations and replaces the large pattern-matching trinucleotide lookup with a direct arithmetic calculation, yielding measurable performance improvements in the benchmarks provided.
Changes:
- Reworked
Nucto be a#[repr(u8)]enum with a documented bit layout (bit 0 as insertion flag, bits 1–2 as base encoding). - Implemented bitwise-based helpers on
Nuc(to_int,to_lower,is_insertion,rc) to replace large pattern matches. - Replaced the exhaustive
trinucleotidepattern-match table with a direct computed codon index based on base indices.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| /// Nucleotide representation optimized for bit operations. | ||
| /// | ||
| /// Layout: bit 0 = insertion flag, bits 1-2 = base encoding | ||
| /// - A=0, Ai=1, C=2, Ci=3, G=4, Gi=5, T=6, Ti=7 |
There was a problem hiding this comment.
I'd not enumerate these here, they are right below in the definition.
|
|
||
| /// Convert to lowercase (insertion) variant | ||
| #[inline] | ||
| pub fn to_lower(&self) -> Nuc { |
There was a problem hiding this comment.
On its own, this change gives me a 1.11x on execution time (slower). I'm running only on NC_000913.fna though:
Benchmark 1: ./secondline -s /data/programming/unipept/FragGeneScanRs/example/NC_000913.fna -t complete -w 1 -o /tmp/asdf/NC_000913
Time (mean ± σ): 1.557 s ± 0.102 s [User: 0.990 s, System: 0.560 s]
Range (min … max): 1.473 s … 1.754 s 10 runs
Benchmark 2: ./table -s /data/programming/unipept/FragGeneScanRs/example/NC_000913.fna -t complete -w 1 -o /tmp/asdf/NC_000913
Time (mean ± σ): 1.725 s ± 0.024 s [User: 1.134 s, System: 0.581 s]
Range (min … max): 1.695 s … 1.767 s 10 runs
Benchmark 3: ./transmute -s /data/programming/unipept/FragGeneScanRs/example/NC_000913.fna -t complete -w 1 -o /tmp/asdf/NC_000913
Time (mean ± σ): 1.736 s ± 0.026 s [User: 1.124 s, System: 0.603 s]
Range (min … max): 1.697 s … 1.786 s 10 runs
Summary
./secondline -s /data/programming/unipept/FragGeneScanRs/example/NC_000913.fna -t complete -w 1 -o /tmp/asdf/NC_000913 ran
1.11 ± 0.07 times faster than ./table -s /data/programming/unipept/FragGeneScanRs/example/NC_000913.fna -t complete -w 1 -o /tmp/asdf/NC_000913
1.11 ± 0.07 times faster than ./transmute -s /data/programming/unipept/FragGeneScanRs/example/NC_000913.fna -t complete -w 1 -o /tmp/asdf/NC_000913
(secondline is this version without this change, table is me trying out a constant table lookup, transmute is this version)
| let n0 = s[0].to_int()?; | ||
| let n1 = s[1].to_int()?; | ||
| let n2 = s[2].to_int()?; | ||
| Some(n0 * 16 + n1 * 4 + n2) |
There was a problem hiding this comment.
Bitshifts here give a minor improvement for me.
This PR optimizes the Nuc enum to use bitwise operations and replaces the trinucleotide function by direct calculation.
Benchmark on my laptop:
short reads: 532.2 ms ± 4.8 ms
complete genome: 695.9 ms ± 8.3 ms
long reads: 4.488 s ± 0.015 s