Pull requests: abetlen/llama-cpp-python

Add chat template for gemma models
#2183 opened Apr 13, 2026 by C00kieFact0ry
perf: vectorize KV cache prefix matching with numpy (see first sketch below)
#2179 opened Apr 11, 2026 by nausicaalii
build: disable soname to reduce binary size
#2177 opened Apr 9, 2026 by Bing-su
fix(ci): avoid duplicate py3-none release builds
#2172 opened Apr 3, 2026 by abetlen
feat: add reasoning_effort to chat completions API (see second sketch below)
#2167 opened Mar 30, 2026 by abetlen
ci: refactor cpu wheel build workflow
#2164 opened Mar 26, 2026 by Bing-su
fix: correct typo 'seperated' to 'separated'
#2120 opened Feb 7, 2026 by thecaptain789
feat: support Granite-Docling model
#2109 opened Jan 4, 2026 by dhdaines
Update to llama.cpp 2026-01-01
#2108 opened Jan 1, 2026 by avion23
Fixed issue #1938
#2085 opened Nov 5, 2025 by TNing
Include x64 directory for CUDA DLLs on Windows
#2083 opened Oct 24, 2025 by ajparsons
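
For context on #2179 above: a generic sketch of numpy-vectorized token-prefix matching, the technique named in that PR's title. This is an illustration only, not the PR's actual diff; the function name, inputs, and example values are made up here.

    import numpy as np

    def common_prefix_length(cached: np.ndarray, prompt: np.ndarray) -> int:
        """Length of the shared token-ID prefix between a cached sequence
        and a new prompt, computed without a Python-level loop."""
        n = min(cached.size, prompt.size)
        # Positions inside the overlap where the two sequences disagree.
        mismatches = np.flatnonzero(cached[:n] != prompt[:n])
        return int(mismatches[0]) if mismatches.size else n

    # Example: the first three tokens match, so only the tail is new.
    cached = np.array([1, 42, 7, 99, 100])
    prompt = np.array([1, 42, 7, 55])
    print(common_prefix_length(cached, prompt))  # 3

Reusing the matched prefix is what lets a KV cache skip recomputing attention state for the tokens a new prompt shares with a cached sequence.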
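
For context on #2167 above: a minimal client-side sketch of the proposed reasoning_effort parameter, assuming it mirrors the OpenAI-style "low"/"medium"/"high" values. The parameter is only the PR's proposal, not a released llama-cpp-python API, and the model path is a placeholder.

    from llama_cpp import Llama

    llm = Llama(model_path="./models/model.gguf")  # placeholder path

    # reasoning_effort is the parameter proposed in #2167; its name and
    # accepted values ("low" / "medium" / "high") are assumptions here.
    response = llm.create_chat_completion(
        messages=[{"role": "user", "content": "Summarize KV cache reuse."}],
        reasoning_effort="low",
    )
    print(response["choices"][0]["message"]["content"])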