Saved State on Disk is very large #1303

@tc-wolf

Description

Is your feature request related to a problem? Please describe.
When using Llama.save_state, the size on disk is very large, largely because Llama.scores always has shape (n_ctx, n_vocab), even when the number of tokens actually used is much smaller.
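For a sense of scale, a rough back-of-the-envelope sketch (the numbers are illustrative, assuming float32 scores and LLaMA-style defaults, not measured from an actual saved state):

```python
BYTES_F32 = 4        # scores are stored as float32

n_ctx = 4096         # full context window
n_vocab = 32000      # e.g. LLaMA-style vocabulary size
n_tokens = 50        # tokens actually evaluated so far

# The scores buffer is allocated as (n_ctx, n_vocab), so the full
# array is serialized regardless of how many rows were ever written.
full_bytes = n_ctx * n_vocab * BYTES_F32
used_bytes = n_tokens * n_vocab * BYTES_F32

print(f"full scores array: {full_bytes / 1e6:.0f} MB")  # ~524 MB
print(f"rows actually used: {used_bytes / 1e6:.1f} MB") # ~6.4 MB
```

So for a short prompt, the serialized scores can be two orders of magnitude larger than the data that was actually produced.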

Describe the solution you'd like

  • Only store the number of tokens used by the model (self.n_tokens).
  • If possible, re-use logits from the llama_context struct instead of serializing scores.
    • These didn't match when I loaded both and compared, though I didn't have logits-processor callbacks set.
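The first bullet could be prototyped with a pair of helpers like the following (hypothetical names; a sketch assuming scores is a NumPy float32 array of shape (n_ctx, n_vocab) as in llama-cpp-python, not the actual implementation):

```python
import numpy as np

def save_scores(scores: np.ndarray, n_tokens: int) -> np.ndarray:
    """Keep only the rows actually filled by evaluation; the rest are zeros."""
    return scores[:n_tokens].copy()

def load_scores(saved: np.ndarray, n_ctx: int) -> np.ndarray:
    """Re-inflate to the full (n_ctx, n_vocab) shape the sampling code expects."""
    n_tokens, n_vocab = saved.shape
    scores = np.zeros((n_ctx, n_vocab), dtype=saved.dtype)
    scores[:n_tokens] = saved
    return scores

# Round trip: only 2 of 4 rows are serialized, and the restored
# array has the original shape with zero padding for unused rows.
full = np.arange(12, dtype=np.float32).reshape(4, 3)
saved = save_scores(full, n_tokens=2)
restored = load_scores(saved, n_ctx=4)
```

The padding rows are all-zero either way (they were never written), so the round trip is lossless for the rows that matter.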

Describe alternatives you've considered

  • Somehow compressing LlamaState
  • Re-using logits from C++ state saving functions and eliminating scores on Python model
    • Looks like scores is still needed on the Python side for the sampling logic.

Partially fixed by #1296
