Update dependency ggml-org/llama.cpp to v9566 #222

Open
renovate[bot] wants to merge 1 commit into main from renovate/ggml-org-llama.cpp-9566.x

renovate Bot commented Jun 8, 2026

ℹ️ Note

This PR body was truncated due to platform limits.

This PR contains the following updates:

Package | Update | Change
ggml-org/llama.cpp | major | b9066 → b9566

Release Notes

ggml-org/llama.cpp (ggml-org/llama.cpp)

vb9566

graph: guard iswa kq_mask on its own buffer (#24294)

A SWA-only draft head (e.g. StepFun MTP) leaves the base sub-cache
empty, so its kq_mask buffer stays null and an assert fires at load.
Guard each mask (base and SWA) on its own buffer in set_input and can_reuse.

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
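
The guard described in this commit message can be pictured with a small C++ sketch. The types and the helper `fill_mask_if_allocated` below are hypothetical stand-ins, not llama.cpp's actual graph-input classes. The only idea taken from the message is that each mask is checked against its own buffer before it is touched, so an empty base sub-cache no longer trips an assert.

```cpp
#include <vector>

// Hypothetical stand-in for a mask tensor. `buffer` stays null when the
// sub-cache that owns the mask is empty and nothing was allocated for it.
struct mask_tensor {
    void *             buffer = nullptr;
    std::vector<float> data;
};

// Hypothetical input holding one mask per sub-cache (base and SWA).
struct iswa_input {
    mask_tensor base_mask;
    mask_tensor swa_mask;
};

// Write a mask only when it has backing storage.
// Returns true if the mask was filled, false if it was skipped.
static bool fill_mask_if_allocated(mask_tensor & m, float value) {
    if (m.buffer == nullptr) {
        return false; // unallocated mask: skip it instead of asserting
    }
    for (float & x : m.data) {
        x = value;
    }
    return true;
}

// Guard the base and SWA masks independently; returns how many were filled.
static int set_input(iswa_input & in, float value) {
    int n_filled = 0;
    n_filled += fill_mask_if_allocated(in.base_mask, value) ? 1 : 0;
    n_filled += fill_mask_if_allocated(in.swa_mask,  value) ? 1 : 0;
    return n_filled;
}
```

With a SWA-only input, only the SWA mask is written and the call returns 1 rather than failing on the missing base mask.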

macOS/iOS:

Linux:

Android:

Windows:

openEuler:

  • DISABLED
  • openEuler x86 (310p)
  • openEuler x86 (910b, ACL Graph)
  • openEuler aarch64 (310p)
  • openEuler aarch64 (910b, ACL Graph)

UI:

vb9565

[ggml-webgpu] Handle buffer overlap / buffer aliasing for concat operator (#24000)

  • Only run webgpu CI on my fork

  • Add webgpu only workflow

  • handle buffer overlap case for concat operator

  • restore build-webgpu.yml

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

  • Run clang-format

  • Update ggml/src/ggml-webgpu/wgsl-shaders/concat.wgsl


Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Reese Levine <reeselevine1@gmail.com>
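
The commit list does not say how the overlap case is detected, so the following is only a generic C++ sketch of what "buffer overlap" means for two tensors placed in the same allocation. `buffer_range` and `ranges_overlap` are hypothetical names and are not taken from the ggml-webgpu backend.

```cpp
#include <cstddef>

// Hypothetical description of a tensor's byte range inside a buffer.
struct buffer_range {
    const void * buffer; // identity of the underlying buffer
    std::size_t  offset; // start of the range in bytes
    std::size_t  size;   // length of the range in bytes
};

// Two ranges alias when they live in the same buffer and their
// half-open intervals [offset, offset + size) intersect.
static bool ranges_overlap(const buffer_range & a, const buffer_range & b) {
    if (a.buffer != b.buffer) {
        return false;
    }
    return a.offset < b.offset + b.size && b.offset < a.offset + a.size;
}
```

Ranges that merely touch (one ends exactly where the other starts) are treated as non-overlapping.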

vb9564

[ggml-webgpu] Implement 2D workgroups for scale, binary, and unary ops (#24044)

  • Only run webgpu CI on my fork

  • Add webgpu only workflow

  • Implement 2d workgroups for more operations

  • fix

  • Fix type

  • Move back to global_invocation_id

vb9562

mtmd : add video input support (#24269)

  • wip

  • ok: lazy bitmap API

  • remember to free lazy text

  • wip

  • add mtmd_helper_video

  • support video input on server (base64 input)

  • add MTMD_VIDEO config

  • add timestamp

  • update CLI

  • cli: allow auto-completion for video

  • add --video arg

  • fix build

  • update docs

  • rename as suggested

vb9561

sync : ggml

vb9559

cli: fix spinner not showing during prompt processing (#24283)

vb9558

vulkan: Use cm2 decode_vector for mul_mat_id B matrix loads (#23991)

This allows vec4 loads of the B elements. Also increase BK to 64 when this is
enabled. Neither of these alone is consistently faster, but together these give
a nice speedup.

In ggml-vulkan.cpp, we need to make sure the B matrix alignment and stride are
multiples of 4.
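
The multiple-of-4 requirement can be written as a small predicate. This C++ sketch is illustrative only: `b_supports_vec4_loads` and `choose_bk` are hypothetical helpers, "alignment" is interpreted here as the starting element offset of B, and the fallback BK value is whatever the caller passes in. None of this is the actual logic in ggml-vulkan.cpp.

```cpp
#include <cstdint>

// Hypothetical check: vec4 loads of B are only safe when both the starting
// element offset and the row stride (in elements) are multiples of 4.
static bool b_supports_vec4_loads(uint64_t b_offset_elems, uint64_t b_stride_elems) {
    return b_offset_elems % 4 == 0 && b_stride_elems % 4 == 0;
}

// Hypothetical tile-depth choice mirroring the note that BK is raised to 64
// only when the vec4 path is enabled; otherwise the caller's default is kept.
static uint32_t choose_bk(bool vec4_enabled, uint32_t default_bk) {
    return vec4_enabled ? 64u : default_bk;
}
```

Tying the two together reflects the commit's observation that neither change is consistently faster on its own.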

vb9557

cuda: reset cuda context after reading memory size (#23935)

  • cuda: reset device in get_memory function if no backend is active

  • also count device and host buffers

  • exclude hip and musa from counting and device reset

  • use device mutex instead of atomic

  • undo backend_free function move

vb9556

HIP: add gfx1152 and gfx1153 to RDNA3.5 (#24129)

vb9555

metal : fix im2col 1D case (audio models) (#24220)

vb9553

common : relax sampler name matching (#23744)

  • common : relax sampler name matching

Currently, in some cases, the alternative names for samplers (like
top-k and min-p instead of the canonical top_k and min_p) are
not always recognized by the common_sampler_types_from_names function
in common/sampling.cpp.

This PR changes the signature of this function to remove the bool
allow_alt_names flag, and removes all occurrences of the flag from call
sites. Therefore, the function will now always match all known names.

I also changed the logic of the function to unconditionally check the
provided sampler names against both the canonical and alternative names,
and to be case-insensitive.

This fixes an issue I was seeing wherein samplers specified in the
llama-server UI were not recognized as valid when the alternative
names were used.

  • add more alt names

  • cont. fix

  • cast to unsigned char for correctness

  • common : unify sampler name mapping

  • annotate canonical vs. alt sampler name mappings per @CISC

  • Update common/sampling.cpp

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

  • common : auto-generate sampler name aliases per @ngxson

  • use merged map for matching

  • use .merge instead of iterating

  • nit: simplify comment

  • nit: use insert everywhere, not index assignment


Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
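
The matching behaviour described in this commit (canonical and alternative names checked unconditionally, case-insensitive, aliases auto-generated and merged into one map) can be sketched in C++17 as follows. The enum, the name tables, and `types_from_names` are simplified hypothetical stand-ins, not the real `common_sampler_types_from_names` from common/sampling.cpp. The underscore-to-dash alias rule is an assumption based on the `top_k` / `top-k` example in the description.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical, reduced set of sampler kinds for illustration.
enum class sampler_type { top_k, top_p, min_p, temperature };

// Lower-case a name; the cast to unsigned char keeps std::tolower well defined.
static std::string to_lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Build one lookup table holding canonical names, hand-written alternatives,
// and aliases generated by replacing '_' with '-'.
static std::unordered_map<std::string, sampler_type> build_name_map() {
    const std::unordered_map<std::string, sampler_type> canonical = {
        {"top_k",       sampler_type::top_k},
        {"top_p",       sampler_type::top_p},
        {"min_p",       sampler_type::min_p},
        {"temperature", sampler_type::temperature},
    };
    std::unordered_map<std::string, sampler_type> alt = {
        {"temp", sampler_type::temperature}, // assumed alternative name
    };
    for (const auto & kv : canonical) {
        std::string dashed = kv.first;
        std::replace(dashed.begin(), dashed.end(), '_', '-');
        alt.insert({dashed, kv.second}); // no-op if the key already exists
    }
    std::unordered_map<std::string, sampler_type> merged = canonical;
    merged.merge(alt); // keys already present in `merged` are left untouched
    return merged;
}

// Resolve user-supplied names; unknown names are skipped.
static std::vector<sampler_type> types_from_names(const std::vector<std::string> & names) {
    static const auto name_map = build_name_map();
    std::vector<sampler_type> out;
    for (const auto & name : names) {
        const auto it = name_map.find(to_lower(name));
        if (it != name_map.end()) {
            out.push_back(it->second);
        }
    }
    return out;
}
```

Because there is no flag to opt in to alternative names, a caller such as a web UI can pass `Top-K` or `min_p` and get the same result either way.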

vb9551

kv-cache : avoid kv cells copies (#24277)

vb9550

kv-cache: follow the source cache size when sharing cells (#24267)

A fitted target context can end up smaller than the draft default. The
oversized assistant views then overflow the shared K/V tensors and trip
the ggml_view_4d size assert during graph reserve.
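
The sizing rule in the title can be illustrated with a short C++ sketch. `kv_cache_desc`, `effective_kv_size`, and `view_fits` are hypothetical names, and the cell counts in the usage note are made-up numbers. The sketch only shows the idea that a cache sharing cells takes its size from the source cache instead of its own default.

```cpp
#include <cstdint>

// Hypothetical cache descriptor.
struct kv_cache_desc {
    uint32_t              kv_size;    // cells this cache would allocate on its own
    const kv_cache_desc * share_from; // non-null when cells are shared with another cache
};

// When cells are shared, follow the source cache size instead of the default.
static uint32_t effective_kv_size(const kv_cache_desc & c) {
    return c.share_from != nullptr ? c.share_from->kv_size : c.kv_size;
}

// A view is valid only if it does not exceed the tensor it is taken from.
static bool view_fits(uint32_t view_cells, uint32_t tensor_cells) {
    return view_cells <= tensor_cells;
}
```

For example, with a target fitted to 4096 cells and a draft default of 8192, a view sized from the draft default would not fit the shared tensors, while one sized by `effective_kv_size` would.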

vb9549

llama : add Gemma4 MTP (#23398)

vb9548

spec : fix vocab compatibility check (#24256)

vb9547

arg: Skip mmproj download when user supplied mmproj (#24239)

Note

PR body was truncated to here.


Configuration

📅 Schedule: (UTC)

  • Branch creation
    • At any time (no schedule defined)
  • Automerge
    • At any time (no schedule defined)

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about this update again.


  • If you want to rebase/retry this PR, check this box

This PR was generated by Mend Renovate. View the repository job log.
