Conversation
There was a problem hiding this comment.
Pull request overview
Replaces manual Vec::with_capacity + push loops with vec![value; count] initialization for the alpha and path buffers in the Viterbi forward pass.
Changes:
- Pre-initialize
alphaasvec![[0.0; State::COUNT]; seq.len()]. - Pre-initialize
pathasvec![[Some(State::S); State::COUNT]; seq.len()]. - Remove the manual
for _ in 0..seq.len()initialization loop.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| let mut alpha: Vec<[f64; hmm::State::COUNT]> = vec![[0.0; hmm::State::COUNT]; seq.len()]; | ||
| let mut path: Vec<[Option<hmm::State>; hmm::State::COUNT]> = | ||
| vec![[Some(hmm::State::S); hmm::State::COUNT]; seq.len()]; |
There was a problem hiding this comment.
forward() allocates alpha/path using seq.len(), but the function later unconditionally indexes alpha[0] and seq[0..=2] (and writes to alpha[1]/alpha[2]). Since the caller currently only guards is_empty(), inputs with length 1–2 will panic. Consider adding an early return/guard for seq.len() < 3 (either here or in viterbi()/caller) so short records are handled safely.
There was a problem hiding this comment.
This is a valid comment, but was also valid for the original code. I fixed it at the caller side in FragGeneScanRs.rs.
ninewise
left a comment
There was a problem hiding this comment.
Doesn't seem to make a time difference. Whichever way you find more readable.
Benchmark 1: ./benchmark-1 -s example/NC_000913.fna -t complete -w 1 -o /tmp/asdf/NC_000913
Time (mean ± σ): 1.469 s ± 0.026 s [User: 0.918 s, System: 0.543 s]
Range (min … max): 1.432 s … 1.509 s 10 runs
Benchmark 2: ./benchmark-2 -s example/NC_000913.fna -t complete -w 1 -o /tmp/asdf/NC_000913
Time (mean ± σ): 1.487 s ± 0.103 s [User: 0.927 s, System: 0.553 s]
Range (min … max): 1.351 s … 1.587 s 10 runs
Summary
./benchmark-1 -s example/NC_000913.fna -t complete -w 1 -o /tmp/asdf/NC_000913 ran
1.01 ± 0.07 times faster than ./benchmark-2 -s example/NC_000913.fna -t complete -w 1 -o /tmp/asdf/NC_000913
This pull request replaces manual vector initialization with
vec![value; count]foralphaandpath.Benchmark on my laptop:
short reads: 450.7 ms ± 7.4 ms
complete genome: 626.8 ms ± 3.9 ms
long reads: 3.937 s ± 0.009 s
See #17 for a comparison.