Name and Version
b9251
Operating systems
Linux
GGML backends
Vulkan
Hardware
RX 9060 XT + RX 6600. CPU: Ryzen 5 5500
Models
Qwen 3.6 27B
Problem description & steps to reproduce
When I run Qwen 3.6 27B, everything works normally until I try to use its vision capabilities. During image processing, progress reaches 1% and then gets stuck, and the server aborts with vk::DeviceLostError (see the log below).
First Bad Commit
No response
Relevant log output
Logs
RADV_PERFTEST=nogttspill ./llama-server -m "/home/belirsiz-adam/.cache/huggingface/hub/models--unsloth--Qwen3.6-27B-MTP-GGUF/snapshots/b3a58239d8d40b953e34936c9afeb28baa518230/Qwen3.6-27B-UD-Q4_K_XL.gguf" --mmproj "/home/belirsiz-adam/.cache/huggingface/hub/models--unsloth--Qwen3.6-27B-MTP-GGUF/snapshots/b3a58239d8d40b953e34936c9afeb28baa518230/mmproj-BF16.gguf" -fa on --ctx-size 24576 -ngl auto --parallel 1 --spec-type draft-mtp --spec-draft-n-max 2
0.00.083.529 I log_info: verbosity = 3 (adjust with the -lv N CLI arg)
0.00.083.533 I device_info:
0.00.083.767 I - Vulkan0 : AMD Radeon RX 6600 (RADV NAVI23) (8176 MiB, 7683 MiB free)
0.00.083.956 I - Vulkan1 : AMD Radeon RX 9060 XT (RADV GFX1200) (16304 MiB, 16246 MiB free)
0.00.083.963 I - CPU : AMD Ryzen 5 5500 (30958 MiB, 30958 MiB free)
0.00.084.008 I system_info: n_threads = 6 (n_threads_batch = 6) / 12 | CPU : SSE3 = 1 | SSSE3 = 1 | AVX = 1 | AVX2 = 1 | F16C = 1 | FMA = 1 | BMI2 = 1 | LLAMAFILE = 1 | OPENMP = 1 | REPACK = 1 |
0.00.084.120 I srv init: running without SSL
0.00.084.156 I srv init: using 11 threads for HTTP server
0.00.084.469 I srv start: binding port with default address family
0.00.085.647 I srv main: loading model
0.00.085.650 I srv load_model: loading model '/home/belirsiz-adam/.cache/huggingface/hub/models--unsloth--Qwen3.6-27B-MTP-GGUF/snapshots/b3a58239d8d40b953e34936c9afeb28baa518230/Qwen3.6-27B-UD-Q4_K_XL.gguf'
0.01.386.916 I srv load_model: [mtmd] estimated memory usage of mmproj is 1161.02 MiB
0.01.386.940 I common_init_result: fitting params to device memory ...
0.01.386.948 I common_init_result: (for bugs during this step try to reproduce them with -fit off, or provide --verbose logs if the bug only occurs with -fit on)
0.15.638.881 W llama_context: n_ctx_seq (24576) < n_ctx_train (262144) -- the full capacity of the model will not be utilized
0.15.711.560 I common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)
0.15.845.240 I srv load_model: creating MTP draft context against the target model '/home/belirsiz-adam/.cache/huggingface/hub/models--unsloth--Qwen3.6-27B-MTP-GGUF/snapshots/b3a58239d8d40b953e34936c9afeb28baa518230/Qwen3.6-27B-UD-Q4_K_XL.gguf'
0.15.845.267 W llama_context: n_ctx_seq (24576) < n_ctx_train (262144) -- the full capacity of the model will not be utilized
0.15.881.653 W load_hparams: Qwen-VL models require at minimum 1024 image tokens to function correctly on grounding tasks
0.15.881.656 W load_hparams: if you encounter problems with accuracy, try adding --image-min-tokens 1024
0.15.881.656 W load_hparams: more info: #16842
0.16.577.950 I srv load_model: loaded multimodal model, '/home/belirsiz-adam/.cache/huggingface/hub/models--unsloth--Qwen3.6-27B-MTP-GGUF/snapshots/b3a58239d8d40b953e34936c9afeb28baa518230/mmproj-BF16.gguf'
0.16.577.955 I srv load_model: initializing slots, n_slots = 1
0.16.679.848 I common_context_can_seq_rm: the context supports bounded partial sequence removal
0.16.695.964 I common_speculative_impl_draft_mtp: adding speculative implementation 'draft-mtp'
0.16.695.972 I common_speculative_impl_draft_mtp: - n_max=2, n_min=0, p_min=0.00, n_embd=5120
0.16.695.973 I common_speculative_impl_draft_mtp: - gpu_layers=-1, cache_k=f16, cache_v=f16, ctx_tgt=yes, ctx_dft=yes, devices=[default]
0.16.696.080 I srv load_model: speculative decoding context initialized
0.16.696.083 I slot load_model: id 0 | task -1 | new slot, n_ctx = 24576
0.16.696.116 I srv load_model: prompt cache is enabled, size limit: 8192 MiB
0.16.696.116 I srv load_model: use --cache-ram 0 to disable the prompt cache
0.16.696.117 I srv load_model: for more info see #16391
0.16.696.130 W srv init: --cache-idle-slots requires --kv-unified, disabling
0.16.711.843 I init: chat template, example_format: '<|im_start|>system
You are a helpful assistant<|im_end|>
<|im_start|>user
Hello<|im_end|>
<|im_start|>assistant
Hi there<|im_end|>
<|im_start|>user
How are you?<|im_end|>
<|im_start|>assistant
'
0.16.722.455 I srv init: init: chat template, thinking = 1
0.16.722.473 I srv main: model loaded
0.16.722.474 I srv main: server is listening on http://127.0.0.1:8080
0.16.722.476 I srv update_slots: all slots are idle
0.22.392.511 I srv params_from_: Chat format: peg-native
0.22.393.071 I slot get_availabl: id 0 | task -1 | selected slot by LRU, t_last = -1
0.22.393.076 I srv get_availabl: updating prompt cache
0.22.393.081 I srv load: - looking for better prompt, base f_keep = -1.000, sim = 0.000
0.22.393.085 I srv update: - cache state: 0 prompts, 0.000 MiB (limits: 8192.000 MiB, 24576 tokens, 8589934592 est)
0.22.393.088 I srv get_availabl: prompt cache update took 0.01 ms
0.22.393.169 I slot launch_slot_: id 0 | task 0 | processing task, is_child = 0
0.22.517.769 I srv process_chun: processing image...
0.25.446.327 W find_slot: non-consecutive token position 4 after 3 for sequence 0 with 512 new tokens
0.25.446.332 W find_slot: non-consecutive token position 4 after 4 for sequence 0 with 512 new tokens
0.25.446.333 W find_slot: non-consecutive token position 4 after 4 for sequence 0 with 512 new tokens
0.25.446.334 W find_slot: non-consecutive token position 4 after 4 for sequence 0 with 512 new tokens
0.25.446.974 W find_slot: non-consecutive token position 4 after 3 for sequence 0 with 512 new tokens
radv/amdgpu: The CS has been cancelled because the context is lost. This context is innocent.
[New LWP 20636]
[New LWP 20635]
[New LWP 20634]
[New LWP 20633]
[New LWP 20632]
[New LWP 20631]
[New LWP 20630]
[New LWP 20629]
[New LWP 20628]
[New LWP 20627]
[New LWP 20626]
[New LWP 20625]
[New LWP 20624]
[New LWP 20623]
[New LWP 20622]
[New LWP 20621]
[New LWP 20619]
This GDB supports auto-downloading debuginfo from the following URLs:
https://debuginfod.ubuntu.com
Enable debuginfod for this session? (y or [n]) [answered N; input not from terminal]
Debuginfod has been disabled.
To make this setting permanent, add 'set debuginfod enabled off' to .gdbinit.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/x86_64-linux-gnu/libthread_db.so.1".
__syscall_cancel_arch () at ../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S:56
⚠️ warning: 56 ../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S: No such file or directory
#0 __syscall_cancel_arch () at ../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S:56
56 in ../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S
#1 0x00007c71f42a067c in __internal_syscall_cancel (a1=, a2=, a3=, a4=, a5=0, a6=0, nr=61) at ./nptl/cancellation.c:49
⚠️ warning: 49 ./nptl/cancellation.c: No such file or directory
#2 __syscall_cancel (a1=, a2=, a3=, a4=, a5=a5@entry=0, a6=a6@entry=0, nr=61) at ./nptl/cancellation.c:75
75 in ./nptl/cancellation.c
#3 0x00007c71f431cdcf in __GI___wait4 (pid=, stat_loc=, options=, usage=) at ../sysdeps/unix/sysv/linux/wait4.c:30
⚠️ warning: 30 ../sysdeps/unix/sysv/linux/wait4.c: No such file or directory
#4 0x00007c71f495b73b in ggml_print_backtrace () from /home/belirsiz-adam/Documents/yedek/llama-b9251-bin-ubuntu-vulkan-x64/llama-b9251/libggml-base.so.0
#5 0x00007c71f496f56f in ggml_uncaught_exception() () from /home/belirsiz-adam/Documents/yedek/llama-b9251-bin-ubuntu-vulkan-x64/llama-b9251/libggml-base.so.0
#6 0x00007c71f46c364a in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#7 0x00007c71f46abc6c in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#8 0x00007c71f46c3901 in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#9 0x00007c71e8889ee7 in ggml_vk_submit(std::shared_ptr<vk_context_struct>&, vk::Fence) [clone .cold] () from /home/belirsiz-adam/Documents/yedek/llama-b9251-bin-ubuntu-vulkan-x64/llama-b9251/libggml-vulkan.so
#10 0x00007c71e895180a in ggml_vk_preallocate_buffers(ggml_backend_vk_context*, std::shared_ptr<vk_context_struct>) () from /home/belirsiz-adam/Documents/yedek/llama-b9251-bin-ubuntu-vulkan-x64/llama-b9251/libggml-vulkan.so
#11 0x00007c71e89568f2 in ggml_vk_mul_mat_q_f16(ggml_backend_vk_context*, std::shared_ptr<vk_context_struct>&, ggml_tensor const*, ggml_tensor const*, ggml_tensor*, bool) () from /home/belirsiz-adam/Documents/yedek/llama-b9251-bin-ubuntu-vulkan-x64/llama-b9251/libggml-vulkan.so
#12 0x00007c71e898332c in ggml_vk_build_graph(ggml_backend_vk_context*, ggml_cgraph*, int, ggml_tensor*, int, bool, bool, bool) [clone .isra.0] () from /home/belirsiz-adam/Documents/yedek/llama-b9251-bin-ubuntu-vulkan-x64/llama-b9251/libggml-vulkan.so
#13 0x00007c71e8985179 in ggml_backend_vk_graph_compute(ggml_backend*, ggml_cgraph*) () from /home/belirsiz-adam/Documents/yedek/llama-b9251-bin-ubuntu-vulkan-x64/llama-b9251/libggml-vulkan.so
#14 0x00007c71f49797df in ggml_backend_sched_graph_compute_async () from /home/belirsiz-adam/Documents/yedek/llama-b9251-bin-ubuntu-vulkan-x64/llama-b9251/libggml-base.so.0
#15 0x00007c71f4ad1fa1 in llama_context::graph_compute(ggml_cgraph*, bool) () from /home/belirsiz-adam/Documents/yedek/llama-b9251-bin-ubuntu-vulkan-x64/llama-b9251/libllama.so.0
#16 0x00007c71f4ad23e5 in llama_context::process_ubatch(llama_ubatch const&, llm_graph_type, llama_memory_context_i*, ggml_status&) () from /home/belirsiz-adam/Documents/yedek/llama-b9251-bin-ubuntu-vulkan-x64/llama-b9251/libllama.so.0
#17 0x00007c71f4adaf57 in llama_context::decode(llama_batch const&) () from /home/belirsiz-adam/Documents/yedek/llama-b9251-bin-ubuntu-vulkan-x64/llama-b9251/libllama.so.0
#18 0x00007c71f4adc6f0 in llama_decode () from /home/belirsiz-adam/Documents/yedek/llama-b9251-bin-ubuntu-vulkan-x64/llama-b9251/libllama.so.0
#19 0x00007c71f4d8fe38 in mtmd_helper_decode_image_chunk () from /home/belirsiz-adam/Documents/yedek/llama-b9251-bin-ubuntu-vulkan-x64/llama-b9251/libmtmd.so.0
#20 0x00007c71f4d910ee in mtmd_helper_eval_chunk_single () from /home/belirsiz-adam/Documents/yedek/llama-b9251-bin-ubuntu-vulkan-x64/llama-b9251/libmtmd.so.0
#21 0x000059a0f4aed165 in server_tokens::process_chunk(llama_context*, mtmd_context*, unsigned long, int, int, unsigned long&) const ()
#22 0x000059a0f4b3f8f4 in server_context_impl::update_slots() ()
#23 0x000059a0f4bbc961 in server_queue::start_loop(long) ()
#24 0x000059a0f4a961ef in main ()
[Inferior 1 (process 20615) detached]
terminate called after throwing an instance of 'vk::DeviceLostError'
what(): vk::Queue::submit: ErrorDeviceLost
Aborted (core dumped) RADV_PERFTEST=nogttspill ./llama-server -m "/home/belirsiz-adam/.cache/huggingface/hub/models--unsloth--Qwen3.6-27B-MTP-GGUF/snapshots/b3a58239d8d40b953e34936c9afeb28baa518230/Qwen3.6-27B-UD-Q4_K_XL.gguf" --mmproj "/home/belirsiz-adam/.cache/huggingface/hub/models--unsloth--Qwen3.6-27B-MTP-GGUF/snapshots/b3a58239d8d40b953e34936c9afeb28baa518230/mmproj-BF16.gguf" -fa on --ctx-size 24576 -ngl auto --parallel 1 --spec-type draft-mtp --spec-draft-n-max 2