Tags: dimitribarbot/llama-cpp-python

v0.3.27

fix(eval): prevent batch size from halving below 1 during KV slot exhaustion

- Added an explicit guard to break the dynamic batch downgrade loop when `current_batch_size` is exactly 1 and a Code 1 (No KV slot) is returned.
- Prevents the engine from computing `1 // 2`, which evaluates to 0, and from emitting the confusing "Halving batch size from 1 to 0" verbose log.
- Ensures the evaluation process fails fast and aborts gracefully when physical VRAM is completely depleted and no further fallback is mathematically possible.
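
The guard described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `eval_with_fallback`, the `decode` callable, the `NO_KV_SLOT` constant, and the error messages are hypothetical names chosen for the example. Only `current_batch_size`, the "Code 1 (No KV slot)" meaning, and the wording of the halving log are taken from the commit message.

```python
from typing import Callable, List, Sequence

# Assumption: return code 1 means "no KV slot", as stated in the commit message.
NO_KV_SLOT = 1


def eval_with_fallback(
    tokens: Sequence[int],
    decode: Callable[[Sequence[int]], int],  # hypothetical decode hook, returns 0 on success
    batch_size: int,
    log: Callable[[str], None] = print,
) -> List[int]:
    """Decode tokens in chunks, halving the chunk size on NO_KV_SLOT.

    Returns the sizes of the chunks that were decoded successfully.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    current_batch_size = batch_size
    decoded_sizes: List[int] = []
    pos = 0

    while pos < len(tokens):
        chunk = tokens[pos : pos + current_batch_size]
        code = decode(chunk)

        if code == 0:
            decoded_sizes.append(len(chunk))
            pos += len(chunk)
            continue

        if code == NO_KV_SLOT:
            # The guard: at batch size 1 there is nothing left to halve,
            # so abort instead of computing 1 // 2 == 0.
            if current_batch_size == 1:
                raise RuntimeError("no KV slot available at batch size 1")
            new_size = current_batch_size // 2
            log(f"Halving batch size from {current_batch_size} to {new_size}")
            current_batch_size = new_size
            continue

        # Any other non-zero code is treated as a hard failure in this sketch.
        raise RuntimeError(f"decode failed with code {code}")

    return decoded_sizes
```

With a `decode` stub that always returns `NO_KV_SLOT`, a starting batch size of 4 is halved to 2 and then to 1, after which the function raises instead of logging a move to 0.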