Eval bug: Core dump when running Qwen 3.6 27B with mmproj on Vulkan; works without mmproj #23430

### Name and Version

b9251

### Operating systems

Linux

### GGML backends

Vulkan

### Hardware

GPU: RX 9060 XT + RX 6600. CPU: Ryzen 5 5500

### Models

Qwen 3.6 27B (unsloth `Qwen3.6-27B-UD-Q4_K_XL.gguf` with `mmproj-BF16.gguf`)

### Problem description & steps to reproduce

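When I run Qwen 3.6 27B, everything works normally until I try to use its vision capabilities. During image processing it reaches 1% and then gets stuck, after which the server aborts with `vk::DeviceLostError` (core dumped). Running without `--mmproj` works.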
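
For anyone trying to reproduce this, here is a minimal sketch of the kind of request that should exercise the image path. It assumes the server is started with the command shown in the log output and is listening on `http://127.0.0.1:8080`, and that the image is sent through the OpenAI-compatible `/v1/chat/completions` endpoint as a base64 data URL. The prompt, the `image/png` MIME type, the `max_tokens` value, and the helper names `build_payload` and `send` are illustrative assumptions, not details from the original report.

```python
import base64
import json
import urllib.request

# Address taken from the "server is listening on" line in the log.
SERVER_URL = "http://127.0.0.1:8080/v1/chat/completions"


def build_payload(image_bytes: bytes, prompt: str = "Describe this image.") -> dict:
    """Build a chat request carrying one image as a base64 data URL.

    The MIME type is assumed to be image/png; adjust it for other formats.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    data_url = "data:image/png;base64," + encoded
    return {
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": data_url}},
                ],
            }
        ],
        "max_tokens": 64,  # arbitrary small value; the failure occurs before generation
    }


def send(payload: dict, url: str = SERVER_URL) -> dict:
    """POST the payload as JSON and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=600) as response:
        return json.load(response)
```

To trigger the request, call something like `send(build_payload(open("test.png", "rb").read()))`, where `test.png` is a hypothetical local image. If the bug reproduces, the call is expected to fail with a connection error once the server process aborts.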

### First Bad Commit

_No response_

### Relevant log output


RADV_PERFTEST=nogttspill ./llama-server -m "/home/belirsiz-adam/.cache/huggingface/hub/models--unsloth--Qwen3.6-27B-MTP-GGUF/snapshots/b3a58239d8d40b953e34936c9afeb28baa518230/Qwen3.6-27B-UD-Q4_K_XL.gguf" --mmproj "/home/belirsiz-adam/.cache/huggingface/hub/models--unsloth--Qwen3.6-27B-MTP-GGUF/snapshots/b3a58239d8d40b953e34936c9afeb28baa518230/mmproj-BF16.gguf" -fa on --ctx-size 24576 -ngl auto --parallel 1 --spec-type draft-mtp --spec-draft-n-max 2
0.00.083.529 I log_info: verbosity = 3 (adjust with the `-lv N` CLI arg)
0.00.083.533 I device_info:
0.00.083.767 I   - Vulkan0 : AMD Radeon RX 6600 (RADV NAVI23) (8176 MiB, 7683 MiB free)
0.00.083.956 I   - Vulkan1 : AMD Radeon RX 9060 XT (RADV GFX1200) (16304 MiB, 16246 MiB free)
0.00.083.963 I   - CPU     : AMD Ryzen 5 5500 (30958 MiB, 30958 MiB free)
0.00.084.008 I system_info: n_threads = 6 (n_threads_batch = 6) / 12 | CPU : SSE3 = 1 | SSSE3 = 1 | AVX = 1 | AVX2 = 1 | F16C = 1 | FMA = 1 | BMI2 = 1 | LLAMAFILE = 1 | OPENMP = 1 | REPACK = 1 | 
0.00.084.120 I srv          init: running without SSL
0.00.084.156 I srv          init: using 11 threads for HTTP server
0.00.084.469 I srv         start: binding port with default address family
0.00.085.647 I srv          main: loading model
0.00.085.650 I srv    load_model: loading model '/home/belirsiz-adam/.cache/huggingface/hub/models--unsloth--Qwen3.6-27B-MTP-GGUF/snapshots/b3a58239d8d40b953e34936c9afeb28baa518230/Qwen3.6-27B-UD-Q4_K_XL.gguf'
0.01.386.916 I srv    load_model: [mtmd] estimated memory usage of mmproj is 1161.02 MiB
0.01.386.940 I common_init_result: fitting params to device memory ...
0.01.386.948 I common_init_result: (for bugs during this step try to reproduce them with -fit off, or provide --verbose logs if the bug only occurs with -fit on)
0.15.638.881 W llama_context: n_ctx_seq (24576) < n_ctx_train (262144) -- the full capacity of the model will not be utilized
0.15.711.560 I common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)
0.15.845.240 I srv    load_model: creating MTP draft context against the target model '/home/belirsiz-adam/.cache/huggingface/hub/models--unsloth--Qwen3.6-27B-MTP-GGUF/snapshots/b3a58239d8d40b953e34936c9afeb28baa518230/Qwen3.6-27B-UD-Q4_K_XL.gguf'
0.15.845.267 W llama_context: n_ctx_seq (24576) < n_ctx_train (262144) -- the full capacity of the model will not be utilized
0.15.881.653 W load_hparams: Qwen-VL models require at minimum 1024 image tokens to function correctly on grounding tasks
0.15.881.656 W load_hparams: if you encounter problems with accuracy, try adding --image-min-tokens 1024
0.15.881.656 W load_hparams: more info: https://github.com/ggml-org/llama.cpp/issues/16842

0.16.577.950 I srv    load_model: loaded multimodal model, '/home/belirsiz-adam/.cache/huggingface/hub/models--unsloth--Qwen3.6-27B-MTP-GGUF/snapshots/b3a58239d8d40b953e34936c9afeb28baa518230/mmproj-BF16.gguf'
0.16.577.955 I srv    load_model: initializing slots, n_slots = 1
0.16.679.848 I common_context_can_seq_rm: the context supports bounded partial sequence removal
0.16.695.964 I common_speculative_impl_draft_mtp: adding speculative implementation 'draft-mtp'
0.16.695.972 I common_speculative_impl_draft_mtp: - n_max=2, n_min=0, p_min=0.00, n_embd=5120
0.16.695.973 I common_speculative_impl_draft_mtp: - gpu_layers=-1, cache_k=f16, cache_v=f16, ctx_tgt=yes, ctx_dft=yes, devices=[default]
0.16.696.080 I srv    load_model: speculative decoding context initialized
0.16.696.083 I slot   load_model: id  0 | task -1 | new slot, n_ctx = 24576
0.16.696.116 I srv    load_model: prompt cache is enabled, size limit: 8192 MiB
0.16.696.116 I srv    load_model: use `--cache-ram 0` to disable the prompt cache
0.16.696.117 I srv    load_model: for more info see https://github.com/ggml-org/llama.cpp/pull/16391
0.16.696.130 W srv          init: --cache-idle-slots requires --kv-unified, disabling
0.16.711.843 I init: chat template, example_format: '<|im_start|>system
You are a helpful assistant<|im_end|>
<|im_start|>user
Hello<|im_end|>
<|im_start|>assistant
Hi there<|im_end|>
<|im_start|>user
How are you?<|im_end|>
<|im_start|>assistant
<think>
'
0.16.722.455 I srv          init: init: chat template, thinking = 1
0.16.722.473 I srv          main: model loaded
0.16.722.474 I srv          main: server is listening on http://127.0.0.1:8080
0.16.722.476 I srv  update_slots: all slots are idle
0.22.392.511 I srv  params_from_: Chat format: peg-native
0.22.393.071 I slot get_availabl: id  0 | task -1 | selected slot by LRU, t_last = -1
0.22.393.076 I srv  get_availabl: updating prompt cache
0.22.393.081 I srv          load:  - looking for better prompt, base f_keep = -1.000, sim = 0.000
0.22.393.085 I srv        update:  - cache state: 0 prompts, 0.000 MiB (limits: 8192.000 MiB, 24576 tokens, 8589934592 est)
0.22.393.088 I srv  get_availabl: prompt cache update took 0.01 ms
0.22.393.169 I slot launch_slot_: id  0 | task 0 | processing task, is_child = 0
0.22.517.769 I srv  process_chun: processing image...
0.25.446.327 W find_slot: non-consecutive token position 4 after 3 for sequence 0 with 512 new tokens
0.25.446.332 W find_slot: non-consecutive token position 4 after 4 for sequence 0 with 512 new tokens
0.25.446.333 W find_slot: non-consecutive token position 4 after 4 for sequence 0 with 512 new tokens
0.25.446.334 W find_slot: non-consecutive token position 4 after 4 for sequence 0 with 512 new tokens
0.25.446.974 W find_slot: non-consecutive token position 4 after 3 for sequence 0 with 512 new tokens
radv/amdgpu: The CS has been cancelled because the context is lost. This context is innocent.
[New LWP 20636]
[New LWP 20635]
[New LWP 20634]
[New LWP 20633]
[New LWP 20632]
[New LWP 20631]
[New LWP 20630]
[New LWP 20629]
[New LWP 20628]
[New LWP 20627]
[New LWP 20626]
[New LWP 20625]
[New LWP 20624]
[New LWP 20623]
[New LWP 20622]
[New LWP 20621]
[New LWP 20619]

This GDB supports auto-downloading debuginfo from the following URLs:
  <https://debuginfod.ubuntu.com>
Enable debuginfod for this session? (y or [n]) [answered N; input not from terminal]
Debuginfod has been disabled.
To make this setting permanent, add 'set debuginfod enabled off' to .gdbinit.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/x86_64-linux-gnu/libthread_db.so.1".
__syscall_cancel_arch () at ../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S:56
⚠️ warning: 56	../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S: No such file or directory
#0  __syscall_cancel_arch () at ../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S:56
56	in ../sysdeps/unix/sysv/linux/x86_64/syscall_cancel.S
#1  0x00007c71f42a067c in __internal_syscall_cancel (a1=<optimized out>, a2=<optimized out>, a3=<optimized out>, a4=<optimized out>, a5=0, a6=0, nr=61) at ./nptl/cancellation.c:49
⚠️ warning: 49	./nptl/cancellation.c: No such file or directory
#2  __syscall_cancel (a1=<optimized out>, a2=<optimized out>, a3=<optimized out>, a4=<optimized out>, a5=a5@entry=0, a6=a6@entry=0, nr=61) at ./nptl/cancellation.c:75
75	in ./nptl/cancellation.c
#3  0x00007c71f431cdcf in __GI___wait4 (pid=<optimized out>, stat_loc=<optimized out>, options=<optimized out>, usage=<optimized out>) at ../sysdeps/unix/sysv/linux/wait4.c:30
⚠️ warning: 30	../sysdeps/unix/sysv/linux/wait4.c: No such file or directory
#4  0x00007c71f495b73b in ggml_print_backtrace () from /home/belirsiz-adam/Documents/yedek/llama-b9251-bin-ubuntu-vulkan-x64/llama-b9251/libggml-base.so.0
#5  0x00007c71f496f56f in ggml_uncaught_exception() () from /home/belirsiz-adam/Documents/yedek/llama-b9251-bin-ubuntu-vulkan-x64/llama-b9251/libggml-base.so.0
#6  0x00007c71f46c364a in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#7  0x00007c71f46abc6c in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#8  0x00007c71f46c3901 in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#9  0x00007c71e8889ee7 in ggml_vk_submit(std::shared_ptr<vk_context_struct>&, vk::Fence) [clone .cold] () from /home/belirsiz-adam/Documents/yedek/llama-b9251-bin-ubuntu-vulkan-x64/llama-b9251/libggml-vulkan.so
#10 0x00007c71e895180a in ggml_vk_preallocate_buffers(ggml_backend_vk_context*, std::shared_ptr<vk_context_struct>) () from /home/belirsiz-adam/Documents/yedek/llama-b9251-bin-ubuntu-vulkan-x64/llama-b9251/libggml-vulkan.so
#11 0x00007c71e89568f2 in ggml_vk_mul_mat_q_f16(ggml_backend_vk_context*, std::shared_ptr<vk_context_struct>&, ggml_tensor const*, ggml_tensor const*, ggml_tensor*, bool) () from /home/belirsiz-adam/Documents/yedek/llama-b9251-bin-ubuntu-vulkan-x64/llama-b9251/libggml-vulkan.so
#12 0x00007c71e898332c in ggml_vk_build_graph(ggml_backend_vk_context*, ggml_cgraph*, int, ggml_tensor*, int, bool, bool, bool) [clone .isra.0] () from /home/belirsiz-adam/Documents/yedek/llama-b9251-bin-ubuntu-vulkan-x64/llama-b9251/libggml-vulkan.so
#13 0x00007c71e8985179 in ggml_backend_vk_graph_compute(ggml_backend*, ggml_cgraph*) () from /home/belirsiz-adam/Documents/yedek/llama-b9251-bin-ubuntu-vulkan-x64/llama-b9251/libggml-vulkan.so
#14 0x00007c71f49797df in ggml_backend_sched_graph_compute_async () from /home/belirsiz-adam/Documents/yedek/llama-b9251-bin-ubuntu-vulkan-x64/llama-b9251/libggml-base.so.0
#15 0x00007c71f4ad1fa1 in llama_context::graph_compute(ggml_cgraph*, bool) () from /home/belirsiz-adam/Documents/yedek/llama-b9251-bin-ubuntu-vulkan-x64/llama-b9251/libllama.so.0
#16 0x00007c71f4ad23e5 in llama_context::process_ubatch(llama_ubatch const&, llm_graph_type, llama_memory_context_i*, ggml_status&) () from /home/belirsiz-adam/Documents/yedek/llama-b9251-bin-ubuntu-vulkan-x64/llama-b9251/libllama.so.0
#17 0x00007c71f4adaf57 in llama_context::decode(llama_batch const&) () from /home/belirsiz-adam/Documents/yedek/llama-b9251-bin-ubuntu-vulkan-x64/llama-b9251/libllama.so.0
#18 0x00007c71f4adc6f0 in llama_decode () from /home/belirsiz-adam/Documents/yedek/llama-b9251-bin-ubuntu-vulkan-x64/llama-b9251/libllama.so.0
#19 0x00007c71f4d8fe38 in mtmd_helper_decode_image_chunk () from /home/belirsiz-adam/Documents/yedek/llama-b9251-bin-ubuntu-vulkan-x64/llama-b9251/libmtmd.so.0
#20 0x00007c71f4d910ee in mtmd_helper_eval_chunk_single () from /home/belirsiz-adam/Documents/yedek/llama-b9251-bin-ubuntu-vulkan-x64/llama-b9251/libmtmd.so.0
#21 0x000059a0f4aed165 in server_tokens::process_chunk(llama_context*, mtmd_context*, unsigned long, int, int, unsigned long&) const ()
#22 0x000059a0f4b3f8f4 in server_context_impl::update_slots() ()
#23 0x000059a0f4bbc961 in server_queue::start_loop(long) ()
#24 0x000059a0f4a961ef in main ()
[Inferior 1 (process 20615) detached]
terminate called after throwing an instance of 'vk::DeviceLostError'
  what():  vk::Queue::submit: ErrorDeviceLost
Aborted                    (core dumped) RADV_PERFTEST=nogttspill ./llama-server -m "/home/belirsiz-adam/.cache/huggingface/hub/models--unsloth--Qwen3.6-27B-MTP-GGUF/snapshots/b3a58239d8d40b953e34936c9afeb28baa518230/Qwen3.6-27B-UD-Q4_K_XL.gguf" --mmproj "/home/belirsiz-adam/.cache/huggingface/hub/models--unsloth--Qwen3.6-27B-MTP-GGUF/snapshots/b3a58239d8d40b953e34936c9afeb28baa518230/mmproj-BF16.gguf" -fa on --ctx-size 24576 -ngl auto --parallel 1 --spec-type draft-mtp --spec-draft-n-max 2
