Allow changing k parameter of SampleTopK as a compiler flag #97
copybara-service[bot] merged 1 commit into google:dev
This is alright for now, but full disclosure: kSeqLen was a bit of a hack (in a bad way) that I'd like to get rid of sooner rather than later. It's very tempting to couple code to it as if it were a global variable. I think the general direction is to move most configuration (except for a select few items) from compile-time constants to runtime variables. That gives more flexibility: you don't have to rerun cmake to change something about inference, and different model capabilities can be combined more easily in a single application. One mitigation is to write functions (especially low-level ones) so they still take top-k as a parameter, and to avoid referencing kTopK (or any of these compile-time constants) directly except in the very high-level application code, where it can more easily be replaced by a runtime value.
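A minimal sketch of that mitigation, assuming a sampler that works on a plain vector of probabilities. `TopKIndices`, `SampleTopKParam`, and `kDefaultTopK` are hypothetical names for illustration, not the library's actual functions. The point is only that `k` arrives as an argument, so the low-level code never reads a global constant:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical stand-in for a compile-time constant such as kTopK.
// Only top-level application code should refer to it.
constexpr std::size_t kDefaultTopK = 1;

// Low-level helper: k is a parameter, not a global.
// Returns the indices of the k largest probabilities, highest first.
std::vector<std::size_t> TopKIndices(const std::vector<float>& probs,
                                     std::size_t k) {
  assert(k > 0 && k <= probs.size());
  std::vector<std::size_t> idx(probs.size());
  std::iota(idx.begin(), idx.end(), std::size_t{0});
  std::partial_sort(
      idx.begin(), idx.begin() + static_cast<std::ptrdiff_t>(k), idx.end(),
      [&probs](std::size_t a, std::size_t b) { return probs[a] > probs[b]; });
  idx.resize(k);
  return idx;
}

// Samples one index among the top k, weighted by the original probabilities.
template <class Rng>
std::size_t SampleTopKParam(const std::vector<float>& probs, std::size_t k,
                            Rng& gen) {
  const std::vector<std::size_t> top = TopKIndices(probs, k);
  std::vector<float> weights;
  weights.reserve(top.size());
  for (std::size_t i : top) weights.push_back(probs[i]);
  std::discrete_distribution<std::size_t> dist(weights.begin(), weights.end());
  return top[dist(gen)];
}
```

Application code would call `SampleTopKParam(probs, kDefaultTopK, gen)` today, and could later pass a value read at runtime without touching the helper.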
Agreed. When I read this part of the code, I was wondering why TopK uses compile-time parameters instead of in
Yes, one distinction between InferenceArgs and RuntimeConfig: in general, command-line arguments are specific to an application/frontend, so we are trying to keep those concerns (argument information, data validation, error handling) decoupled from the core library. One can think of
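A rough sketch of that separation, assuming InferenceArgs lives on the frontend side and RuntimeConfig on the core side, as the comment above suggests. The structs below are simplified stand-ins with hypothetical fields and arbitrary defaults, not the real definitions:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Core-library side (stand-in): plain validated values, no parsing logic.
struct RuntimeConfigSketch {
  std::size_t max_generated_tokens;
  float temperature;
  std::size_t top_k;
};

// Frontend side (stand-in): raw command-line values plus their validation.
struct InferenceArgsSketch {
  long max_generated_tokens = 2048;  // arbitrary default for illustration
  float temperature = 1.0f;
  long top_k = 1;

  // Returns an error message, or an empty string if the arguments are valid.
  std::string Validate() const {
    if (max_generated_tokens <= 0) return "max_generated_tokens must be > 0";
    if (temperature < 0.0f) return "temperature must be >= 0";
    if (top_k <= 0) return "top_k must be > 0";
    return "";
  }
};

// The only bridge between the two: the frontend validates, then hands the
// core library a config that knows nothing about flags or error messages.
bool ToRuntimeConfig(const InferenceArgsSketch& args,
                     RuntimeConfigSketch& out) {
  if (!args.Validate().empty()) return false;
  out.max_generated_tokens =
      static_cast<std::size_t>(args.max_generated_tokens);
  out.temperature = args.temperature;
  out.top_k = static_cast<std::size_t>(args.top_k);
  assert(out.top_k > 0);  // invariant guaranteed by Validate()
  return true;
}
```

With this split, a different frontend (a server, a test harness) could fill in the config struct directly and skip the argument struct altogether.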
Just add a compiler flag like GEMMA_MAX_SEQLEN.
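For illustration, a flag of this kind is usually a preprocessor macro with a fallback default. The macro name `GEMMA_TOPK`, its default of 1, and the `EffectiveTopK` helper are assumptions made for this sketch and may not match the merged change:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Assumed macro name, modeled on GEMMA_MAX_SEQLEN. Override it at build time,
// for example by passing -DGEMMA_TOPK=5 to the compiler.
#ifndef GEMMA_TOPK
#define GEMMA_TOPK 1  // assumed default
#endif

static_assert(GEMMA_TOPK > 0, "GEMMA_TOPK must be positive");

// Compile-time constant derived from the flag.
constexpr std::size_t kTopK = GEMMA_TOPK;

// Hypothetical helper: never ask for more candidates than the vocabulary has.
inline std::size_t EffectiveTopK(std::size_t vocab_size) {
  assert(vocab_size > 0);
  return std::min(kTopK, vocab_size);
}
```

Because the value is fixed when the binary is built, changing it means recompiling, which is the limitation raised in the review discussion.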