srv update_slots: all slots are idle
srv params_from_: Chat format: peg-native
slot get_availabl: id 1 | task -1 | selected slot by LCP similarity, sim_best = 0.965 (> 0.100 thold), f_keep = 0.967
slot launch_slot_: id 1 | task -1 | sampler chain: logits -> ?penalties -> ?dry -> ?top-n-sigma -> top-k -> ?typical -> ?top-p -> ?min-p -> ?xtc -> temp-ext -> dist
slot launch_slot_: id 1 | task 93797 | processing task, is_child = 0
slot update_slots: id 1 | task 93797 | new prompt, n_ctx_slot = 65536, n_keep = 0, task.n_tokens = 9287
slot update_slots: id 1 | task 93797 | n_past = 8962, slot.prompt.tokens.size() = 9267, seq_id = 1, pos_min = 9266, n_swa = 0
slot update_slots: id 1 | task 93797 | Checking checkpoint with [8886, 8886] against 8962...
slot update_slots: id 1 | task 93797 | restored context checkpoint (pos_min = 8886, pos_max = 8886, n_tokens = 8887, n_past = 8887, size = 149.626 MiB)
slot update_slots: id 1 | task 93797 | n_tokens = 8887, memory_seq_rm [8887, end)
slot update_slots: id 1 | task 93797 | prompt processing progress, n_tokens = 9283, batch.n_tokens = 396, progress = 0.999569
slot update_slots: id 1 | task 93797 | n_tokens = 9283, memory_seq_rm [9283, end)
reasoning-budget: activated, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
slot init_sampler: id 1 | task 93797 | init sampler, took 0.99 ms, tokens: text = 9287, total = 9287
slot update_slots: id 1 | task 93797 | prompt processing done, n_tokens = 9287, batch.n_tokens = 4
slot create_check: id 1 | task 93797 | created context checkpoint 8 of 64 (pos_min = 9282, pos_max = 9282, n_tokens = 9283, size = 149.626 MiB)
srv log_server_r: done request: POST /v1/chat/completions 127.0.0.1 200
reasoning-budget: deactivated (natural end)
slot print_timing: id 1 | task 93797 |
prompt eval time = 2374.07 ms / 400 tokens ( 5.94 ms per token, 168.49 tokens per second)
eval time = 10536.28 ms / 111 tokens ( 94.92 ms per token, 10.54 tokens per second)
total time = 12910.35 ms / 511 tokens
slot release: id 1 | task 93797 | stop processing: n_tokens = 9397, truncated = 0
srv update_slots: all slots are idle
srv params_from_: Chat format: peg-native
slot get_availabl: id 1 | task -1 | selected slot by LCP similarity, sim_best = 0.928 (> 0.100 thold), f_keep = 1.000
slot launch_slot_: id 1 | task -1 | sampler chain: logits -> ?penalties -> ?dry -> ?top-n-sigma -> top-k -> ?typical -> ?top-p -> ?min-p -> ?xtc -> temp-ext -> dist
slot launch_slot_: id 1 | task 93910 | processing task, is_child = 0
slot update_slots: id 1 | task 93910 | new prompt, n_ctx_slot = 65536, n_keep = 0, task.n_tokens = 10121
slot update_slots: id 1 | task 93910 | n_tokens = 9397, memory_seq_rm [9397, end)
slot update_slots: id 1 | task 93910 | prompt processing progress, n_tokens = 9609, batch.n_tokens = 212, progress = 0.949412
slot update_slots: id 1 | task 93910 | n_tokens = 9609, memory_seq_rm [9609, end)
slot update_slots: id 1 | task 93910 | prompt processing progress, n_tokens = 10117, batch.n_tokens = 508, progress = 0.999605
slot create_check: id 1 | task 93910 | created context checkpoint 9 of 64 (pos_min = 9608, pos_max = 9608, n_tokens = 9609, size = 149.626 MiB)
slot update_slots: id 1 | task 93910 | n_tokens = 10117, memory_seq_rm [10117, end)
reasoning-budget: activated, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
slot init_sampler: id 1 | task 93910 | init sampler, took 1.08 ms, tokens: text = 10121, total = 10121
slot update_slots: id 1 | task 93910 | prompt processing done, n_tokens = 10121, batch.n_tokens = 4
slot create_check: id 1 | task 93910 | created context checkpoint 10 of 64 (pos_min = 10116, pos_max = 10116, n_tokens = 10117, size = 149.626 MiB)
srv log_server_r: done request: POST /v1/chat/completions 127.0.0.1 200
reasoning-budget: deactivated (natural end)
slot print_timing: id 1 | task 93910 |
prompt eval time = 4096.00 ms / 724 tokens ( 5.66 ms per token, 176.76 tokens per second)
eval time = 24213.29 ms / 250 tokens ( 96.85 ms per token, 10.32 tokens per second)
total time = 28309.29 ms / 974 tokens
slot release: id 1 | task 93910 | stop processing: n_tokens = 10370, truncated = 0
srv update_slots: all slots are idle
srv params_from_: Chat format: peg-native
slot get_availabl: id 1 | task -1 | selected slot by LCP similarity, sim_best = 0.902 (> 0.100 thold), f_keep = 1.000
slot launch_slot_: id 1 | task -1 | sampler chain: logits -> ?penalties -> ?dry -> ?top-n-sigma -> top-k -> ?typical -> ?top-p -> ?min-p -> ?xtc -> temp-ext -> dist
slot launch_slot_: id 1 | task 94163 | processing task, is_child = 0
slot update_slots: id 1 | task 94163 | new prompt, n_ctx_slot = 65536, n_keep = 0, task.n_tokens = 11502
slot update_slots: id 1 | task 94163 | n_tokens = 10370, memory_seq_rm [10370, end)
slot update_slots: id 1 | task 94163 | prompt processing progress, n_tokens = 10882, batch.n_tokens = 512, progress = 0.946096
slot update_slots: id 1 | task 94163 | n_tokens = 10882, memory_seq_rm [10882, end)
slot update_slots: id 1 | task 94163 | prompt processing progress, n_tokens = 10990, batch.n_tokens = 108, progress = 0.955486
slot update_slots: id 1 | task 94163 | n_tokens = 10990, memory_seq_rm [10990, end)
slot update_slots: id 1 | task 94163 | prompt processing progress, n_tokens = 11498, batch.n_tokens = 508, progress = 0.999652
slot create_check: id 1 | task 94163 | created context checkpoint 11 of 64 (pos_min = 10989, pos_max = 10989, n_tokens = 10990, size = 149.626 MiB)
slot update_slots: id 1 | task 94163 | n_tokens = 11498, memory_seq_rm [11498, end)
reasoning-budget: activated, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
slot init_sampler: id 1 | task 94163 | init sampler, took 1.22 ms, tokens: text = 11502, total = 11502
slot update_slots: id 1 | task 94163 | prompt processing done, n_tokens = 11502, batch.n_tokens = 4
slot create_check: id 1 | task 94163 | created context checkpoint 12 of 64 (pos_min = 11497, pos_max = 11497, n_tokens = 11498, size = 149.626 MiB)
srv log_server_r: done request: POST /v1/chat/completions 127.0.0.1 200
reasoning-budget: deactivated (natural end)
slot print_timing: id 1 | task 94163 |
prompt eval time = 6487.16 ms / 1132 tokens ( 5.73 ms per token, 174.50 tokens per second)
eval time = 17400.56 ms / 177 tokens ( 98.31 ms per token, 10.17 tokens per second)
total time = 23887.72 ms / 1309 tokens
slot release: id 1 | task 94163 | stop processing: n_tokens = 11678, truncated = 0
srv update_slots: all slots are idle
srv params_from_: Chat format: peg-native
slot get_availabl: id 1 | task -1 | selected slot by LCP similarity, sim_best = 0.998 (> 0.100 thold), f_keep = 1.000
slot launch_slot_: id 1 | task -1 | sampler chain: logits -> ?penalties -> ?dry -> ?top-n-sigma -> top-k -> ?typical -> ?top-p -> ?min-p -> ?xtc -> temp-ext -> dist
slot launch_slot_: id 1 | task 94344 | processing task, is_child = 0
slot update_slots: id 1 | task 94344 | new prompt, n_ctx_slot = 65536, n_keep = 0, task.n_tokens = 11697
slot update_slots: id 1 | task 94344 | n_tokens = 11678, memory_seq_rm [11678, end)
slot update_slots: id 1 | task 94344 | prompt processing progress, n_tokens = 11693, batch.n_tokens = 15, progress = 0.999658
slot create_check: id 1 | task 94344 | created context checkpoint 13 of 64 (pos_min = 11677, pos_max = 11677, n_tokens = 11678, size = 149.626 MiB)
slot update_slots: id 1 | task 94344 | n_tokens = 11693, memory_seq_rm [11693, end)
reasoning-budget: activated, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
slot init_sampler: id 1 | task 94344 | init sampler, took 1.24 ms, tokens: text = 11697, total = 11697
slot update_slots: id 1 | task 94344 | prompt processing done, n_tokens = 11697, batch.n_tokens = 4
srv log_server_r: done request: POST /v1/chat/completions 127.0.0.1 200
reasoning-budget: deactivated (natural end)
slot print_timing: id 1 | task 94344 |
prompt eval time = 400.78 ms / 19 tokens ( 21.09 ms per token, 47.41 tokens per second)
eval time = 8571.52 ms / 87 tokens ( 98.52 ms per token, 10.15 tokens per second)
total time = 8972.30 ms / 106 tokens
slot release: id 1 | task 94344 | stop processing: n_tokens = 11783, truncated = 0
srv update_slots: all slots are idle
srv params_from_: Chat format: peg-native
slot get_availabl: id 1 | task -1 | selected slot by LCP similarity, sim_best = 0.986 (> 0.100 thold), f_keep = 1.000
slot launch_slot_: id 1 | task -1 | sampler chain: logits -> ?penalties -> ?dry -> ?top-n-sigma -> top-k -> ?typical -> ?top-p -> ?min-p -> ?xtc -> temp-ext -> dist
slot launch_slot_: id 1 | task 94433 | processing task, is_child = 0
slot update_slots: id 1 | task 94433 | new prompt, n_ctx_slot = 65536, n_keep = 0, task.n_tokens = 11956
slot update_slots: id 1 | task 94433 | n_tokens = 11783, memory_seq_rm [11783, end)
slot update_slots: id 1 | task 94433 | prompt processing progress, n_tokens = 11952, batch.n_tokens = 169, progress = 0.999665
slot create_check: id 1 | task 94433 | created context checkpoint 14 of 64 (pos_min = 11782, pos_max = 11782, n_tokens = 11783, size = 149.626 MiB)
slot update_slots: id 1 | task 94433 | n_tokens = 11952, memory_seq_rm [11952, end)
reasoning-budget: activated, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
reasoning-budget: deactivated (natural end)
reasoning-budget: re-activated on new start tag, budget=2147483647 tokens
slot init_sampler: id 1 | task 94433 | init sampler, took 1.15 ms, tokens: text = 11956, total = 11956
slot update_slots: id 1 | task 94433 | prompt processing done, n_tokens = 11956, batch.n_tokens = 4
slot create_check: id 1 | task 94433 | created context checkpoint 15 of 64 (pos_min = 11951, pos_max = 11951, n_tokens = 11952, size = 149.626 MiB)
srv log_server_r: done request: POST /v1/chat/completions 127.0.0.1 200
reasoning-budget: deactivated (natural end)
slot print_timing: id 1 | task 94433 |
prompt eval time = 1289.55 ms / 173 tokens ( 7.45 ms per token, 134.16 tokens per second)
eval time = 17754.35 ms / 179 tokens ( 99.19 ms per token, 10.08 tokens per second)
total time = 19043.89 ms / 352 tokens
slot release: id 1 | task 94433 | stop processing: n_tokens = 12134, truncated = 0
srv update_slots: all slots are idle
srv params_from_: Chat format: peg-native
slot get_availabl: id 0 | task -1 | selected slot by LRU, t_last = 65026105309
slot launch_slot_: id 0 | task -1 | sampler chain: logits -> ?penalties -> ?dry -> ?top-n-sigma -> top-k -> ?typical -> ?top-p -> ?min-p -> ?xtc -> temp-ext -> dist
slot launch_slot_: id 0 | task 94614 | processing task, is_child = 0
slot update_slots: id 0 | task 94614 | new prompt, n_ctx_slot = 65536, n_keep = 0, task.n_tokens = 86082
srv send_error: task id = 94614, error: request (86082 tokens) exceeds the available context size (65536 tokens), try increasing it
slot release: id 0 | task 94614 | stop processing: n_tokens = 12947, truncated = 0
srv stop: cancel task, id_task = 94614
srv update_slots: no tokens to decode
srv update_slots: all slots are idle
srv log_server_r: done request: POST /v1/chat/completions 127.0.0.1 400
srv params_from_: Chat format: peg-native
slot get_availabl: id 1 | task -1 | selected slot by LRU, t_last = 65147626039
slot launch_slot_: id 1 | task -1 | sampler chain: logits -> ?penalties -> ?dry -> ?top-n-sigma -> top-k -> ?typical -> ?top-p -> ?min-p -> ?xtc -> temp-ext -> dist
slot launch_slot_: id 1 | task 94617 | processing task, is_child = 0
slot update_slots: id 1 | task 94617 | new prompt, n_ctx_slot = 65536, n_keep = 0, task.n_tokens = 53564
slot update_slots: id 1 | task 94617 | n_past = 3, slot.prompt.tokens.size() = 12134, seq_id = 1, pos_min = 12133, n_swa = 0
slot update_slots: id 1 | task 94617 | Checking checkpoint with [11951, 11951] against 3...
slot update_slots: id 1 | task 94617 | Checking checkpoint with [11782, 11782] against 3...
slot update_slots: id 1 | task 94617 | Checking checkpoint with [11677, 11677] against 3...
slot update_slots: id 1 | task 94617 | Checking checkpoint with [11497, 11497] against 3...
slot update_slots: id 1 | task 94617 | Checking checkpoint with [10989, 10989] against 3...
slot update_slots: id 1 | task 94617 | Checking checkpoint with [10116, 10116] against 3...
slot update_slots: id 1 | task 94617 | Checking checkpoint with [9608, 9608] against 3...
slot update_slots: id 1 | task 94617 | Checking checkpoint with [9282, 9282] against 3...
slot update_slots: id 1 | task 94617 | Checking checkpoint with [8886, 8886] against 3...
slot update_slots: id 1 | task 94617 | Checking checkpoint with [8590, 8590] against 3...
slot update_slots: id 1 | task 94617 | Checking checkpoint with [8447, 8447] against 3...
slot update_slots: id 1 | task 94617 | Checking checkpoint with [8249, 8249] against 3...
slot update_slots: id 1 | task 94617 | Checking checkpoint with [8162, 8162] against 3...
slot update_slots: id 1 | task 94617 | Checking checkpoint with [7909, 7909] against 3...
slot update_slots: id 1 | task 94617 | Checking checkpoint with [7401, 7401] against 3...
slot update_slots: id 1 | task 94617 | forcing full prompt re-processing due to lack of cache data (likely due to SWA or hybrid/recurrent memory, see https://github.com/ggml-org/llama.cpp/pull/13194#issuecomment-2868343055)
slot update_slots: id 1 | task 94617 | erased invalidated context checkpoint (pos_min = 7401, pos_max = 7401, n_tokens = 7402, n_swa = 0, pos_next = 0, size = 149.626 MiB)
slot update_slots: id 1 | task 94617 | erased invalidated context checkpoint (pos_min = 7909, pos_max = 7909, n_tokens = 7910, n_swa = 0, pos_next = 0, size = 149.626 MiB)
slot update_slots: id 1 | task 94617 | erased invalidated context checkpoint (pos_min = 8162, pos_max = 8162, n_tokens = 8163, n_swa = 0, pos_next = 0, size = 149.626 MiB)
slot update_slots: id 1 | task 94617 | erased invalidated context checkpoint (pos_min = 8249, pos_max = 8249, n_tokens = 8250, n_swa = 0, pos_next = 0, size = 149.626 MiB)
slot update_slots: id 1 | task 94617 | erased invalidated context checkpoint (pos_min = 8447, pos_max = 8447, n_tokens = 8448, n_swa = 0, pos_next = 0, size = 149.626 MiB)
slot update_slots: id 1 | task 94617 | erased invalidated context checkpoint (pos_min = 8590, pos_max = 8590, n_tokens = 8591, n_swa = 0, pos_next = 0, size = 149.626 MiB)
slot update_slots: id 1 | task 94617 | erased invalidated context checkpoint (pos_min = 8886, pos_max = 8886, n_tokens = 8887, n_swa = 0, pos_next = 0, size = 149.626 MiB)
slot update_slots: id 1 | task 94617 | erased invalidated context checkpoint (pos_min = 9282, pos_max = 9282, n_tokens = 9283, n_swa = 0, pos_next = 0, size = 149.626 MiB)
slot update_slots: id 1 | task 94617 | erased invalidated context checkpoint (pos_min = 9608, pos_max = 9608, n_tokens = 9609, n_swa = 0, pos_next = 0, size = 149.626 MiB)
slot update_slots: id 1 | task 94617 | erased invalidated context checkpoint (pos_min = 10116, pos_max = 10116, n_tokens = 10117, n_swa = 0, pos_next = 0, size = 149.626 MiB)
slot update_slots: id 1 | task 94617 | erased invalidated context checkpoint (pos_min = 10989, pos_max = 10989, n_tokens = 10990, n_swa = 0, pos_next = 0, size = 149.626 MiB)
slot update_slots: id 1 | task 94617 | erased invalidated context checkpoint (pos_min = 11497, pos_max = 11497, n_tokens = 11498, n_swa = 0, pos_next = 0, size = 149.626 MiB)
slot update_slots: id 1 | task 94617 | erased invalidated context checkpoint (pos_min = 11677, pos_max = 11677, n_tokens = 11678, n_swa = 0, pos_next = 0, size = 149.626 MiB)
slot update_slots: id 1 | task 94617 | erased invalidated context checkpoint (pos_min = 11782, pos_max = 11782, n_tokens = 11783, n_swa = 0, pos_next = 0, size = 149.626 MiB)
slot update_slots: id 1 | task 94617 | erased invalidated context checkpoint (pos_min = 11951, pos_max = 11951, n_tokens = 11952, n_swa = 0, pos_next = 0, size = 149.626 MiB)
slot update_slots: id 1 | task 94617 | n_tokens = 0, memory_seq_rm [0, end)
slot update_slots: id 1 | task 94617 | prompt processing progress, n_tokens = 512, batch.n_tokens = 512, progress = 0.009559
slot update_slots: id 1 | task 94617 | n_tokens = 512, memory_seq_rm [512, end)
slot update_slots: id 1 | task 94617 | prompt processing progress, n_tokens = 1024, batch.n_tokens = 512, progress = 0.019117
slot update_slots: id 1 | task 94617 | n_tokens = 1024, memory_seq_rm [1024, end)
slot update_slots: id 1 | task 94617 | prompt processing progress, n_tokens = 1536, batch.n_tokens = 512, progress = 0.028676
slot update_slots: id 1 | task 94617 | n_tokens = 1536, memory_seq_rm [1536, end)
slot update_slots: id 1 | task 94617 | prompt processing progress, n_tokens = 2048, batch.n_tokens = 512, progress = 0.03823
Name and Version
PS D:\llm\llama-b8977-bin-win-hip-radeon-x64> .\llama-cli --version
HIP Library Path: C:\WINDOWS\SYSTEM32\amdhip64_7.dll
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 16368 MiB):
Device 0: AMD Radeon RX 7800 XT, gfx1101 (0x1101), VMM: no, Wave Size: 32, VRAM: 16368 MiB
load_backend: loaded ROCm backend from D:\llm\llama-b8977-bin-win-hip-radeon-x64\ggml-hip.dll
load_backend: loaded RPC backend from D:\llm\llama-b8977-bin-win-hip-radeon-x64\ggml-rpc.dll
load_backend: loaded CPU backend from D:\llm\llama-b8977-bin-win-hip-radeon-x64\ggml-cpu-zen4.dll
version: 8977 (b1d5f5b)
built with Clang 19.1.5 for Windows x86_64
Operating systems
Windows
GGML backends
HIP
Hardware
PC Spec:
Ryzen 7 7700X
32 GB DDR5 5200 MHz
Radeon RX 7800 XT, 16 GB
2 TB NVMe SSD
Models
Qwen 3.6 27B @ IQ4_XS / Q4_K_M / UD-IQ3_XXS / Q3_K_M
Problem description & steps to reproduce
llama-server startup script... Also, I am using opencode as the harness.
First Bad Commit
No response
Relevant log output
Logs
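For triaging a log like the one in this report, a small script can pull out the three events that matter here: the request that exceeded the context size, the forced full prompt re-processing, and the per-request throughput. This is only a minimal sketch: the regular expressions are written against the exact wording of the lines in this log and are an assumption for any other llama.cpp build, and `summarize` is a hypothetical helper name, not part of llama.cpp.

```python
import re

# Patterns mirror the wording of the log lines in this report (assumption:
# other builds may phrase these messages differently).
CTX_ERR = re.compile(
    r"task id = (\d+), error: request \((\d+) tokens\) "
    r"exceeds the available context size \((\d+) tokens\)"
)
REPROCESS = re.compile(r"task (\d+) \| forcing full prompt re-processing")
EVAL = re.compile(r"^\s*(prompt eval|eval) time =\s*([\d.]+) ms /\s*(\d+) tokens")


def summarize(lines):
    """Collect context overflows, forced re-processing events, and throughput."""
    summary = {"overflows": [], "reprocessed": [], "timings": []}
    for line in lines:
        if m := CTX_ERR.search(line):
            task, requested, available = map(int, m.groups())
            summary["overflows"].append((task, requested, available))
        elif m := REPROCESS.search(line):
            summary["reprocessed"].append(int(m.group(1)))
        elif m := EVAL.search(line):
            kind, ms, tokens = m.group(1), float(m.group(2)), int(m.group(3))
            # Recompute tokens per second from the raw time and token count.
            summary["timings"].append((kind, round(tokens / (ms / 1000.0), 2)))
    return summary


# Lines excerpted from the log in this report (the second one is cut short).
sample = [
    "srv send_error: task id = 94614, error: request (86082 tokens) exceeds "
    "the available context size (65536 tokens), try increasing it",
    "slot update_slots: id 1 | task 94617 | forcing full prompt re-processing "
    "due to lack of cache data",
    "prompt eval time = 2374.07 ms / 400 tokens ( 5.94 ms per token, 168.49 tokens per second)",
    "eval time = 10536.28 ms / 111 tokens ( 94.92 ms per token, 10.54 tokens per second)",
]

print(summarize(sample))
```

To run it over a full server log, pass the open file instead of `sample`, for example `summarize(open("server.log", encoding="utf-8"))`, where `server.log` is a placeholder file name.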