Misc. bug: Is this a backend VRAM tracking bug?

### Name and Version

version: 9512 (0dbfa66a1)
built with Clang 19.1.5 for Windows x86_64

### Operating systems

Windows

### Which llama.cpp modules do you know to be affected?

llama-cli

### Command line

```shell
llama-vulkan/llama-cli.exe -hf unsloth/Qwen3.6-27B-MTP-GGUF:Q5_K_XL -ngl all --fit on -lv 4 --no-mmap --fa on --cache-type-k x --cache-type-v x --spec-type draft-mtp --spec-draft-n-max 2 --spec-draft-type-k x --spec-draft-type-v x 
```

### Problem description & steps to reproduce

When iterating over KV cache and draft KV cache quantization types (the `x` values in the command above), VRAM usage is not consistent across backends, as shown by the context sizes below.

HIP:

    kv cache  | draft kv  | context
    f16/f16   | ntp       |   49408
    q8_0/q8_0 | ntp       |   82944
    f16/f16   | f16/f16   |   13312
    f16/f16   | q8_0/q8_0 |    4608
    q8_0/q8_0 | f16/f16   |   22784
    q8_0/q8_0 | q8_0/q8_0 |    8192

Vulkan:

    kv cache  | draft kv  | context
    f16/f16   | ntp       |   39936
    q8_0/q8_0 | ntp       |   74496
    f16/f16   | f16/f16   |    8960
    f16/f16   | q8_0/q8_0 |   16640
    q8_0/q8_0 | f16/f16   |   17408
    q8_0/q8_0 | q8_0/q8_0 |   31488 

According to https://github.com/ggml-org/llama.cpp/discussions/24102#discussioncomment-17175261, the HIP numbers are correct but the Vulkan numbers are not.
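
To make the discrepancy easier to see, here is a small sketch that puts the two tables side by side and computes the Vulkan/HIP ratio per configuration. The values are copied verbatim from the tables above. The function name `compare` and the output layout are my own choices for illustration and are not part of llama.cpp. I am also assuming the `context` column is directly comparable between the two backends (same GPU, same model, same flags apart from the cache types).

```python
# Context sizes copied from the HIP and Vulkan tables in this report.
# Keys are (KV cache types, draft KV cache types); "ntp" is kept verbatim.
HIP = {
    ("f16/f16", "ntp"): 49408,
    ("q8_0/q8_0", "ntp"): 82944,
    ("f16/f16", "f16/f16"): 13312,
    ("f16/f16", "q8_0/q8_0"): 4608,
    ("q8_0/q8_0", "f16/f16"): 22784,
    ("q8_0/q8_0", "q8_0/q8_0"): 8192,
}

VULKAN = {
    ("f16/f16", "ntp"): 39936,
    ("q8_0/q8_0", "ntp"): 74496,
    ("f16/f16", "f16/f16"): 8960,
    ("f16/f16", "q8_0/q8_0"): 16640,
    ("q8_0/q8_0", "f16/f16"): 17408,
    ("q8_0/q8_0", "q8_0/q8_0"): 31488,
}


def compare(hip, vulkan):
    """Return (kv, draft_kv, hip_ctx, vulkan_ctx, vulkan/hip ratio) per config."""
    rows = []
    for (kv, draft), hip_ctx in hip.items():
        vulkan_ctx = vulkan[(kv, draft)]
        rows.append((kv, draft, hip_ctx, vulkan_ctx, vulkan_ctx / hip_ctx))
    return rows


if __name__ == "__main__":
    print(f"{'kv cache':<10}| {'draft kv':<10}| {'HIP':>6} | {'Vulkan':>6} | ratio")
    for kv, draft, hip_ctx, vulkan_ctx, ratio in compare(HIP, VULKAN):
        print(f"{kv:<10}| {draft:<10}| {hip_ctx:>6} | {vulkan_ctx:>6} | {ratio:.2f}")
```

With these numbers, Vulkan ends up at roughly 0.67x to 0.90x of the HIP context in four of the six configurations, but at about 3.6x and 3.8x in the two rows where the draft KV cache is `q8_0/q8_0`. Put differently, switching the draft KV cache from f16 to q8_0 shrinks the context on HIP but grows it on Vulkan.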

### First Bad Commit

N/A

### Relevant log output

N/A

