Tags · CodeLinaro/llama.cpp

b6775

gguf-py : add support for endian conversion of BF16 data (ggml-org#16594

)

BF16 requires special handling in this script
while it's a 2-bytes data, but view is 1-byte by default.
Switch to correct view before attempting byteswapping.

With this change correctly byteswapping models like
Meta-Llama-3-8B-Instruct-bf16-GGUF
should be possible.

Oct 15, 2025
7adc79c
zip
tar.gz
Downloads

b6745

metal : add opt_step_adamw and op_sum (ggml-org#16529)

* scaffold to support opt step adamw on metal (not written so far)

* add opt-step-adamw kernel for metal

* pass op->src[4] as a separate buffer to the pipeline

* add bounds check to opt-step-adamw kernel

* complete scaffold for GGML_OP_SUM

* naive GGML_OP_SUM kernel

* remove unwanted comment

* change OP_SUM capability gate

* Add has_simdgroup_reduction to both ops to pass CI

Oct 12, 2025
a31cf36
zip
tar.gz
Downloads

b6725

webui: updated the chat service to only include max_tokens in the req… (

ggml-org#16489)

* webui: updated the chat service to only include max_tokens in the request payload when the setting is explicitly provided, while still mapping explicit zero or null values to the infinite-token sentinel

* chore: update webui build output

Oct 9, 2025
1faa13a
zip
tar.gz
Downloads

b6713

server : fix cancel pending task (ggml-org#16467)

Co-authored-by: DevAI <DevAI@gmail.com>

Oct 8, 2025
d2ee056
zip
tar.gz
Downloads

b6700

llama : add --no-host to disable host buffers (ggml-org#16310)

* implement --no-host to disable host buffer

* fix equal_mparams

* move no-host enumeration order together with other model params

---------

Co-authored-by: slaren <slarengh@gmail.com>

Oct 6, 2025
3df2244
zip
tar.gz
Downloads

b6664

CI: reenable cdna in rocm docker builds (ggml-org#16376)

Oct 1, 2025
c8dedc9
zip
tar.gz
Downloads

b6661

ci: Properly install rocwmma for hip builds (ggml-org#16305)

* CI: Properly install rocwmma for hip builds

on windows we now windows install rocwmma from ubuntu pacakges

* CI: update linux rocm docker build to use rocm 7.0

Oct 1, 2025
1fe4e38
zip
tar.gz
Downloads

b6550

ggml : implement set_rows with i32 index (ggml-org#16159)

* implement set_rows with i32 index

* template fix

* test quantized path

warnings--

* Apply suggestions from code review

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* forgotten name change

* deduplicate cuda/sycl and test-fix

* indent++

* vulkan: support set_rows with i32 index type (ggml-org#16162)

* disable i32 index for webgpu for now

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: Jeff Bolz <jbolz@nvidia.com>

Sep 22, 2025
3ecb2f6
zip
tar.gz
Downloads

b6451

ggml-backend : add GGML_BACKEND_DEVICE_TYPE_IGPU device type (ggml-or…

…g#15797)

* ggml-backend : add GGML_BACKEND_DEVICE_TYPE_IGPU device type

ggml-backend : add device id to device props

llama : only use iGPU devices if there are no GPU devices

llama : do not use multiple devices from different backends with the same device id

Sep 11, 2025
360d653
zip
tar.gz
Downloads

b6423

json : support `enum` values within `allOf` (ggml-org#15830)

Sep 8, 2025
7057faf
zip
tar.gz
Downloads

PreviousNext

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

b6775

b6745

b6725

b6713

b6700

b6664

b6661

b6550

b6451

b6423

Tags: CodeLinaro/llama.cpp