[feature] Vulkan Backend #23

I suggest adding a Vulkan compute backend for GPU inference on Windows and Linux. Vulkan has proven itself to be a reliable (just works™) GPU compute API that is often already installed with the GPU driver, so it adds almost no dependencies. It is also cross-platform and supports GPUs from most vendors, including older ones.
It does tend to perform slightly worse than CUDA/ROCm/Metal in certain aspects, but I think it is still worth implementing, especially considering how much of a speedup it gives compared to the native BLAS backend.
Llama.cpp has had great success with it so far.
