Pull requests: ggml-org/llama.cpp
Pull requests list
vulkan: check coopmat2 features before reporting support
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#24186
opened Jun 5, 2026 by
0cc4m
Contributor
Add ROCmFP4 CPU quantization support
examples
ggml
changes relating to the ggml tensor library for machine learning
#24185
opened Jun 5, 2026 by
charlie12345
Initial ET backend
documentation
Improvements or additions to documentation
examples
ggml
changes relating to the ggml tensor library for machine learning
python
python script changes
server
testing
Everything test related
#24179
opened Jun 5, 2026 by
marty1885
1 task done
common/chat : unify and fix LFM2/LFM2.5 tool parser
testing
Everything test related
#24178
opened Jun 5, 2026 by
tdakhran
Contributor
refactor: replace embd_normalize integer parameter with enum
examples
server
#24173
opened Jun 5, 2026 by
SaurabhCodes-16
2 tasks done
fix issue 23977
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#24172
opened Jun 5, 2026 by
zihaomu
mtmd : add Apple CoreML backend for vision encoding
examples
python
python script changes
#24163
opened Jun 5, 2026 by
tc-mb
Contributor
opencl: improve get_rows, cpy, concat and q6_k flat gemv
ggml
changes relating to the ggml tensor library for machine learning
OpenCL
Issues specific to the OpenCL backend
ui: fix mobile chat form overflow and bust stale bundle cache
examples
server/ui
#24158
opened Jun 5, 2026 by
ServeurpersoCom
Contributor
model-loader: add --reclaim-mmap-source to drop dormant mmap pages (Fixes #16761)
#24156
opened Jun 5, 2026 by
markkobo
server : return HTTP 400 on invalid grammar (#24144)
examples
python
python script changes
server
#24154
opened Jun 5, 2026 by
Anuj-Attri
Sycl --split-mode tensor
ggml
changes relating to the ggml tensor library for machine learning
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
#24152
opened Jun 5, 2026 by
Spruill-1
server: add "schema" and validation
examples
server
#24150
opened Jun 4, 2026 by
ngxson
Contributor
Fix/server prompt cache no consume on load
examples
python
python script changes
server
#24143
opened Jun 4, 2026 by
alainnothere
[PoC] server: support requantizing kv cache
examples
server
#24134
opened Jun 4, 2026 by
wadealexc
HIP: add gfx1152 and gfx1153 to RDNA3.5
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#24129
opened Jun 4, 2026 by
harkgill-amd
CUDA: refactor MMQ kernel configuration
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#24127
opened Jun 4, 2026 by
JohannesGaessler
Contributor
Add ctx-per-slot argument for unified KV cache
examples
server
#24124
opened Jun 4, 2026 by
bartowski1182
Contributor
vulkan: add v_dot2_f32_f16 support in matrix-matrix multiplication and Flash Attention
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#24123
opened Jun 4, 2026 by
0cc4m
Contributor