Releases: NVIDIA/kvpress

v0.5.0

28 Jan 11:55
651bb0a

  • Upgrade to transformers v5 and fix broken tests associated with this upgrade (#180)

v0.4.3

27 Jan 15:53
68f8ef8

  • Minor updates to DMSPress and KVzapPress (#177)
  • Speed up CI/CD (#177)

v0.4.2

21 Jan 10:43
8b3c2f7

  • Rename ThresholdPress to DMSPress (#174)

v0.4.1

14 Jan 09:22
50c2ae5

✨ New Features

  • KVzapPress - a fast approximation of KVzip for prefill and decoding compression (https://arxiv.org/abs/2601.07891). Comes with KVzap training and evaluation utilities (#171)
  • ThresholdPress - adaptive compression using score thresholds instead of fixed compression ratios (#171)
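
The difference between a fixed compression ratio and a score threshold can be illustrated with a small sketch. This is not the ThresholdPress implementation or the kvpress API: the function names, the scores, and the threshold value are hypothetical, and plain Python lists stand in for per-token importance scores.

```python
def keep_by_ratio(scores, compression_ratio):
    """Keep a fixed fraction of tokens: the top (1 - ratio) by score."""
    n_kept = int(len(scores) * (1 - compression_ratio))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_kept])


def keep_by_threshold(scores, threshold):
    """Keep every token whose score reaches the threshold."""
    return [i for i, s in enumerate(scores) if s >= threshold]


# Two hypothetical inputs: one where most tokens matter, one where few do.
scores_a = [0.9, 0.8, 0.7, 0.6]
scores_b = [0.9, 0.1, 0.05, 0.02]

# A fixed ratio keeps the same number of tokens for both inputs.
print(keep_by_ratio(scores_a, 0.5))      # [0, 1]
print(keep_by_ratio(scores_b, 0.5))      # [0, 1]

# A threshold adapts: it keeps everything in A and a single token in B.
print(keep_by_threshold(scores_a, 0.5))  # [0, 1, 2, 3]
print(keep_by_threshold(scores_b, 0.5))  # [0]
```

With a threshold, the effective compression ratio depends on the input rather than being chosen in advance.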

📈 Improvements

  • Update KVzipPress with improvements and evaluation registry support (#172)
  • Rename compress-question to query-aware in evaluation config (#168)
  • Refactor ObservedAttentionPress for cleaner implementation (#166)
  • Add leaderboard generation script (#171)

🐛 Bug Fixes

  • Fix empty context handling in pipeline (#165)

v0.4.0

05 Dec 08:54
8306602

🚀 Release v0.4.0

✨ New Features

  • CURPress - Value-Guided KV Compression for LLMs via Approximated CUR Decomposition (#150)
  • CompactorPress - Compactor: Calibrated Query-Agnostic KV Cache Compression with Approximate Leverage Scores (#143)
  • Decoding Press Functionality - Support for KV cache compression during the decoding phase (#139)
  • AIME25 & Math500 Benchmarks - New evaluation datasets for mathematical reasoning tasks (#142)
  • post_init_from_model Hook - Add model-specific initialization support in BasePress (#163)
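
As a rough illustration of what a model-specific initialization hook makes possible, the sketch below uses toy classes only. `SketchPress`, `HeadAwarePress`, `ToyModel`, the `attach` method, and the `num_key_value_heads` config key are all assumptions made for this example; the real signature and call site of `post_init_from_model` in `BasePress` should be checked in the kvpress source.

```python
class SketchPress:
    """Toy stand-in for a press base class (hypothetical, not kvpress.BasePress)."""

    def post_init_from_model(self, model):
        # Default hook: nothing model-specific to set up.
        pass

    def attach(self, model):
        # Run the hook once the model is known, before any compression happens.
        self.post_init_from_model(model)
        return self


class HeadAwarePress(SketchPress):
    """Toy subclass that needs a value only the model can provide."""

    def __init__(self):
        self.num_heads = None  # unknown until a model is supplied

    def post_init_from_model(self, model):
        # Read a model-dependent setting instead of hard-coding it.
        self.num_heads = model.config["num_key_value_heads"]


class ToyModel:
    """Minimal fake model exposing a config mapping (assumed layout)."""

    config = {"num_key_value_heads": 8}


press = HeadAwarePress().attach(ToyModel())
print(press.num_heads)  # 8
```

The point of the pattern is that a press can be constructed without a model and finish its setup later, once the model it will be applied to is available.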

📈 Improvements

  • Moved tests to GPU for faster CI execution (#132)
  • Improved needle-in-haystack test coverage (#133)
  • Updated README and documentation for clarity (#162)
  • Enhanced docstrings throughout the codebase (#159)
  • Updated decoding notebook with latest examples (#156)
  • Code cleanup: moved utilities, cleaned imports (#160)

🐛 Bug Fixes

  • Fixed LongBench-v2 benchmark evaluation (#161)
  • Fixed KVzipPress access to past_key_values
  • Fixed ComposedPress behavior (#148)
  • Fixed import issues (#144)

📦 Installation

pip install kvpress==0.4.0
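
After installing, the pinned version can be confirmed from Python. This is a minimal sketch using only the standard library; the helper name `installed_version` is made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version("kvpress")
print(found if found is not None else "kvpress is not installed")
```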

📚 Full Changelog

v0.3.0...v0.4.0

v0.3.0

04 Sep 12:47
7dbd3f0

Full Changelog: v0.2.10...v0.3.0

v0.2.10

06 Aug 16:10
3eb3f92

Full Changelog: v0.2.9...v0.2.10

v0.2.9

28 Jul 12:39
52c761c

Full Changelog: v0.2.8...v0.2.9

v0.2.8

08 Jul 10:21
d3fb898

What's Changed

🐛 Bug Fixes

  • Fix failing tests by @maxjeblick in #94
    Reverts changes to CriticalKVPress performed in #90 that caused the press to initialize incorrectly. The PR also fixes some test logic.

Full Changelog: v0.2.7...v0.2.8

v0.2.7

07 Jul 16:52
2bc4e2e

What's Changed

🐛 Bug Fixes

  • Fix FinchPress for Qwen models family by @alessiodevoto in #82
    Resolved compatibility issues with Qwen model architecture in FinchPress compression

✨ New Features

  • Add KeyDiffPress and BlockPress by @figuremout in #86
    Introduces new compression methods based on key difference analysis
  • Fix for Qwen with Yarn by @giulio98 in #85
    Enables YaRN scaling in FinchPress and KeyRerotationPress

Full Changelog: v0.2.6...v0.2.7