[feature] Vulkan Backend #23

@rice7th

I suggest adding a Vulkan compute backend for GPU inference on Windows and Linux. Vulkan has proven to be a reliable ("just works™") GPU compute API that is often already installed, so it adds almost no dependencies. It is also cross-platform and supports most GPUs, including older ones.
It does tend to perform slightly worse than CUDA/ROCm/Metal in some respects, but I think it is still worth implementing, especially considering how much of a speedup it gives over the native BLAS backend.
llama.cpp has had great success with it so far.
