UPSTREAM PR #17637: vulkan: Replace deprecated VK_EXT_validation_features #383

Open
loci-dev wants to merge 1 commit into main from upstream-PR17637-branch_rillomas-replace-validation-features

loci-dev commented Dec 1, 2025

Mirrored from ggml-org/llama.cpp#17637

When building with -DGGML_VULKAN_VALIDATE=ON, I see a lot of validation warnings cluttering the output. I want to remove at least one of them by replacing VK_EXT_validation_features with VK_EXT_layer_settings. This should also make it easier to enable the Debug Printf feature, which makes debugging shaders much easier (at least for me).
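
As a rough illustration of the replacement described above: VK_EXT_layer_settings configures a layer through a list of named settings rather than a list of feature enums. The sketch below uses a stand-in struct so it builds without the Vulkan SDK; `LayerSetting` and `make_validation_settings` are hypothetical names, and the setting names `validate_best_practices` and `printf_enable` are assumptions that should be checked against the validation-layer documentation for the SDK in use. It is not the code from this PR.

```cpp
#include <string>
#include <vector>

// Stand-in for VkLayerSettingEXT so the sketch builds without the Vulkan SDK.
// The real struct carries pLayerName, pSettingName, type, valueCount and
// pValues; this stand-in collapses the last three into a single bool.
struct LayerSetting {
    std::string layer_name;
    std::string setting_name;
    bool        value;
};

// Assumed layer name for the Khronos validation layer.
static const char * const kValidationLayer = "VK_LAYER_KHRONOS_validation";

// Build the list that would be handed to VkLayerSettingsCreateInfoEXT
// (settingCount / pSettings) and chained into VkInstanceCreateInfo::pNext.
std::vector<LayerSetting> make_validation_settings(bool debug_printf) {
    std::vector<LayerSetting> settings = {
        { kValidationLayer, "validate_best_practices", true },
    };
    if (debug_printf) {
        // Assumed setting name for the Debug Printf feature.
        settings.push_back({ kValidationLayer, "printf_enable", true });
    }
    return settings;
}
```

Under these assumptions, turning on Debug Printf becomes one more entry in the list, which is the convenience the description refers to.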

Before
λ build_vk_validation_master\bin\Debug\test-backend-ops.exe -o TOPK_MOE
ggml_vulkan: Validation layers enabled
Validation Warning: [ BestPractices-deprecated-extension ] | MessageID = 0xda8260ba
vkCreateInstance(): Attempting to enable deprecated extension VK_EXT_validation_features, but this extension has been deprecated by VK_EXT_layer_settings.

Validation Warning: [ BestPractices-specialuse-extension ] | MessageID = 0x675dc32e
vkCreateInstance(): Attempting to enable extension VK_EXT_validation_features, but this extension is intended to support use by applications when debugging and it is strongly recommended that it be otherwise avoided.

ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Intel(R) Arc(TM) B580 Graphics (Intel Corporation) | uma: 0 | fp16: 1 | bf16: 0 | warp size: 32 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
register_backend: registered backend Vulkan (1 devices)
register_device: registered device Vulkan0 (Intel(R) Arc(TM) B580 Graphics)
register_backend: registered backend CPU (1 devices)
register_device: registered device CPU (13th Gen Intel(R) Core(TM) i9-13900K)
load_backend: failed to find ggml_backend_init in C:\Users\ae\Documents\mnakasak\repo\llama.cpp_mine\build_vk_validation_master\bin\Debug\ggml-vulkan.dll
load_backend: failed to find ggml_backend_init in C:\Users\ae\Documents\mnakasak\repo\llama.cpp_mine\build_vk_validation_master\bin\Debug\ggml-cpu.dll
Testing 2 devices

Validation Warning: [ BestPractices-deprecated-extension ] | MessageID = 0xda8260ba
vkCreateDevice(): Attempting to enable deprecated extension VK_EXT_pipeline_robustness, but this extension has been promoted to 1.4.0 (0x00404000).
Objects: 1
    [0] VkInstance 0x25a1c478ab0

Validation Warning: [ BestPractices-deprecated-extension ] | MessageID = 0xda8260ba
vkCreateDevice(): Attempting to enable deprecated extension VK_KHR_maintenance4, but this extension has been promoted to 1.3.0 (0x00403000).
Objects: 1
    [0] VkInstance 0x25a1c478ab0

Validation Warning: [ BestPractices-deprecated-extension ] | MessageID = 0xda8260ba
vkCreateDevice(): Attempting to enable deprecated extension VK_KHR_shader_integer_dot_product, but this extension has been promoted to 1.3.0 (0x00403000).
Objects: 1
    [0] VkInstance 0x25a1c478ab0

Validation Warning: [ BestPractices-specialuse-extension ] | MessageID = 0x675dc32e
vkCreateDevice(): Attempting to enable extension VK_KHR_pipeline_executable_properties, but this extension is intended to support developer tools such as capture-replay libraries and it is strongly recommended that it be otherwise avoided.
Objects: 1
    [0] VkInstance 0x25a1c478ab0

Validation Warning: [ BestPractices-deprecated-extension ] | MessageID = 0xda8260ba
vkCreateDevice(): Attempting to enable deprecated extension VK_EXT_subgroup_size_control, but this extension has been promoted to 1.3.0 (0x00403000).
Objects: 1
    [0] VkInstance 0x25a1c478ab0

Validation Warning: [ BestPractices-deprecated-extension ] | MessageID = 0xda8260ba
vkCreateDevice(): Attempting to enable deprecated extension VK_KHR_16bit_storage, but this extension has been promoted to 1.1.0 (0x00401000).
Objects: 1
    [0] VkInstance 0x25a1c478ab0

Validation Warning: [ BestPractices-deprecated-extension ] | MessageID = 0xda8260ba
vkCreateDevice(): Attempting to enable deprecated extension VK_KHR_shader_non_semantic_info, but this extension has been promoted to 1.3.0 (0x00403000).
Objects: 1
    [0] VkInstance 0x25a1c478ab0

Validation Warning: [ BestPractices-deprecated-extension ] | MessageID = 0xda8260ba
vkCreateDevice(): Attempting to enable deprecated extension VK_KHR_shader_float16_int8, but this extension has been promoted to 1.2.0 (0x00402000).
Objects: 1
    [0] VkInstance 0x25a1c478ab0

Backend 1/2: Vulkan0
  Device description: Intel(R) Arc(TM) B580 Graphics
  Device memory: 12112 MB (11343 MB free)

Validation Warning: [ BestPractices-SpirvDeprecated_WorkgroupSize ] | MessageID = 0xd8a870c
vkCreateComputePipelines(): pCreateInfos[0].stage is using the SPIR-V Workgroup built-in which SPIR-V 1.6 deprecated. When using VK_KHR_maintenance4 or Vulkan 1.3+, the new SPIR-V LocalSizeId execution mode should be used instead. This can be done by recompiling your shader and targeting Vulkan 1.3+.

  TOPK_MOE(ne=[8,22,1,1],n_expert_used=4,with_norm=0,delayed_softmax=0): OK
Validation Warning: [ BestPractices-SpirvDeprecated_WorkgroupSize ] | MessageID = 0xd8a870c
vkCreateComputePipelines(): pCreateInfos[0].stage is using the SPIR-V Workgroup built-in which SPIR-V 1.6 deprecated. When using VK_KHR_maintenance4 or Vulkan 1.3+, the new SPIR-V LocalSizeId execution mode should be used instead. This can be done by recompiling your shader and targeting Vulkan 1.3+.

  TOPK_MOE(ne=[32,22,1,1],n_expert_used=8,with_norm=0,delayed_softmax=0): OK
Validation Warning: [ BestPractices-SpirvDeprecated_WorkgroupSize ] | MessageID = 0xd8a870c
vkCreateComputePipelines(): pCreateInfos[0].stage is using the SPIR-V Workgroup built-in which SPIR-V 1.6 deprecated. When using VK_KHR_maintenance4 or Vulkan 1.3+, the new SPIR-V LocalSizeId execution mode should be used instead. This can be done by recompiling your shader and targeting Vulkan 1.3+.

  TOPK_MOE(ne=[128,1,1,1],n_expert_used=128,with_norm=0,delayed_softmax=0): OK
Validation Warning: [ BestPractices-SpirvDeprecated_WorkgroupSize ] | MessageID = 0xd8a870c
vkCreateComputePipelines(): pCreateInfos[0].stage is using the SPIR-V Workgroup built-in which SPIR-V 1.6 deprecated. When using VK_KHR_maintenance4 or Vulkan 1.3+, the new SPIR-V LocalSizeId execution mode should be used instead. This can be done by recompiling your shader and targeting Vulkan 1.3+.

  TOPK_MOE(ne=[8,22,1,1],n_expert_used=4,with_norm=1,delayed_softmax=0): OK
Validation Warning: [ BestPractices-SpirvDeprecated_WorkgroupSize ] | MessageID = 0xd8a870c
vkCreateComputePipelines(): pCreateInfos[0].stage is using the SPIR-V Workgroup built-in which SPIR-V 1.6 deprecated. When using VK_KHR_maintenance4 or Vulkan 1.3+, the new SPIR-V LocalSizeId execution mode should be used instead. This can be done by recompiling your shader and targeting Vulkan 1.3+.

  TOPK_MOE(ne=[32,22,1,1],n_expert_used=8,with_norm=1,delayed_softmax=0): OK
Validation Warning: [ BestPractices-SpirvDeprecated_WorkgroupSize ] | MessageID = 0xd8a870c
vkCreateComputePipelines(): pCreateInfos[0].stage is using the SPIR-V Workgroup built-in which SPIR-V 1.6 deprecated. When using VK_KHR_maintenance4 or Vulkan 1.3+, the new SPIR-V LocalSizeId execution mode should be used instead. This can be done by recompiling your shader and targeting Vulkan 1.3+.

  TOPK_MOE(ne=[128,1,1,1],n_expert_used=128,with_norm=1,delayed_softmax=0): OK
Validation Warning: [ BestPractices-SpirvDeprecated_WorkgroupSize ] | MessageID = 0xd8a870c
vkCreateComputePipelines(): pCreateInfos[0].stage is using the SPIR-V Workgroup built-in which SPIR-V 1.6 deprecated. When using VK_KHR_maintenance4 or Vulkan 1.3+, the new SPIR-V LocalSizeId execution mode should be used instead. This can be done by recompiling your shader and targeting Vulkan 1.3+.

  TOPK_MOE(ne=[8,22,1,1],n_expert_used=4,with_norm=0,delayed_softmax=1): OK
Validation Warning: [ BestPractices-SpirvDeprecated_WorkgroupSize ] | MessageID = 0xd8a870c
vkCreateComputePipelines(): pCreateInfos[0].stage is using the SPIR-V Workgroup built-in which SPIR-V 1.6 deprecated. When using VK_KHR_maintenance4 or Vulkan 1.3+, the new SPIR-V LocalSizeId execution mode should be used instead. This can be done by recompiling your shader and targeting Vulkan 1.3+.

  TOPK_MOE(ne=[32,22,1,1],n_expert_used=8,with_norm=0,delayed_softmax=1): OK
  8/8 tests passed
  Backend Vulkan0: OK
Backend 2/2: CPU
  Skipping CPU backend
2/2 backends passed
OK

After
λ build_vk_validation_mine\bin\Debug\test-backend-ops.exe -o TOPK_MOE
ggml_vulkan: Validation layers enabled
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Intel(R) Arc(TM) B580 Graphics (Intel Corporation) | uma: 0 | fp16: 1 | bf16: 0 | warp size: 32 | shared memory: 32768 | int dot: 1 | matrix cores: KHR_coopmat
register_backend: registered backend Vulkan (1 devices)
register_device: registered device Vulkan0 (Intel(R) Arc(TM) B580 Graphics)
register_backend: registered backend CPU (1 devices)
register_device: registered device CPU (13th Gen Intel(R) Core(TM) i9-13900K)
load_backend: failed to find ggml_backend_init in C:\Users\ae\Documents\mnakasak\repo\llama.cpp_mine\build_vk_validation_mine\bin\Debug\ggml-vulkan.dll
load_backend: failed to find ggml_backend_init in C:\Users\ae\Documents\mnakasak\repo\llama.cpp_mine\build_vk_validation_mine\bin\Debug\ggml-cpu.dll
Testing 2 devices

Validation Warning: [ BestPractices-deprecated-extension ] | MessageID = 0xda8260ba
vkCreateDevice(): Attempting to enable deprecated extension VK_EXT_pipeline_robustness, but this extension has been promoted to 1.4.0 (0x00404000).
Objects: 1
    [0] VkInstance 0x2966a346b80

Validation Warning: [ BestPractices-deprecated-extension ] | MessageID = 0xda8260ba
vkCreateDevice(): Attempting to enable deprecated extension VK_KHR_maintenance4, but this extension has been promoted to 1.3.0 (0x00403000).
Objects: 1
    [0] VkInstance 0x2966a346b80

Validation Warning: [ BestPractices-deprecated-extension ] | MessageID = 0xda8260ba
vkCreateDevice(): Attempting to enable deprecated extension VK_KHR_shader_integer_dot_product, but this extension has been promoted to 1.3.0 (0x00403000).
Objects: 1
    [0] VkInstance 0x2966a346b80

Validation Warning: [ BestPractices-specialuse-extension ] | MessageID = 0x675dc32e
vkCreateDevice(): Attempting to enable extension VK_KHR_pipeline_executable_properties, but this extension is intended to support developer tools such as capture-replay libraries and it is strongly recommended that it be otherwise avoided.
Objects: 1
    [0] VkInstance 0x2966a346b80

Validation Warning: [ BestPractices-deprecated-extension ] | MessageID = 0xda8260ba
vkCreateDevice(): Attempting to enable deprecated extension VK_EXT_subgroup_size_control, but this extension has been promoted to 1.3.0 (0x00403000).
Objects: 1
    [0] VkInstance 0x2966a346b80

Validation Warning: [ BestPractices-deprecated-extension ] | MessageID = 0xda8260ba
vkCreateDevice(): Attempting to enable deprecated extension VK_KHR_16bit_storage, but this extension has been promoted to 1.1.0 (0x00401000).
Objects: 1
    [0] VkInstance 0x2966a346b80

Validation Warning: [ BestPractices-deprecated-extension ] | MessageID = 0xda8260ba
vkCreateDevice(): Attempting to enable deprecated extension VK_KHR_shader_non_semantic_info, but this extension has been promoted to 1.3.0 (0x00403000).
Objects: 1
    [0] VkInstance 0x2966a346b80

Validation Warning: [ BestPractices-deprecated-extension ] | MessageID = 0xda8260ba
vkCreateDevice(): Attempting to enable deprecated extension VK_KHR_shader_float16_int8, but this extension has been promoted to 1.2.0 (0x00402000).
Objects: 1
    [0] VkInstance 0x2966a346b80

Backend 1/2: Vulkan0
  Device description: Intel(R) Arc(TM) B580 Graphics
  Device memory: 12112 MB (11343 MB free)

Validation Warning: [ BestPractices-SpirvDeprecated_WorkgroupSize ] | MessageID = 0xd8a870c
vkCreateComputePipelines(): pCreateInfos[0].stage is using the SPIR-V Workgroup built-in which SPIR-V 1.6 deprecated. When using VK_KHR_maintenance4 or Vulkan 1.3+, the new SPIR-V LocalSizeId execution mode should be used instead. This can be done by recompiling your shader and targeting Vulkan 1.3+.

  TOPK_MOE(ne=[8,22,1,1],n_expert_used=4,with_norm=0,delayed_softmax=0): OK
Validation Warning: [ BestPractices-SpirvDeprecated_WorkgroupSize ] | MessageID = 0xd8a870c
vkCreateComputePipelines(): pCreateInfos[0].stage is using the SPIR-V Workgroup built-in which SPIR-V 1.6 deprecated. When using VK_KHR_maintenance4 or Vulkan 1.3+, the new SPIR-V LocalSizeId execution mode should be used instead. This can be done by recompiling your shader and targeting Vulkan 1.3+.

  TOPK_MOE(ne=[32,22,1,1],n_expert_used=8,with_norm=0,delayed_softmax=0): OK
Validation Warning: [ BestPractices-SpirvDeprecated_WorkgroupSize ] | MessageID = 0xd8a870c
vkCreateComputePipelines(): pCreateInfos[0].stage is using the SPIR-V Workgroup built-in which SPIR-V 1.6 deprecated. When using VK_KHR_maintenance4 or Vulkan 1.3+, the new SPIR-V LocalSizeId execution mode should be used instead. This can be done by recompiling your shader and targeting Vulkan 1.3+.

  TOPK_MOE(ne=[128,1,1,1],n_expert_used=128,with_norm=0,delayed_softmax=0): OK
Validation Warning: [ BestPractices-SpirvDeprecated_WorkgroupSize ] | MessageID = 0xd8a870c
vkCreateComputePipelines(): pCreateInfos[0].stage is using the SPIR-V Workgroup built-in which SPIR-V 1.6 deprecated. When using VK_KHR_maintenance4 or Vulkan 1.3+, the new SPIR-V LocalSizeId execution mode should be used instead. This can be done by recompiling your shader and targeting Vulkan 1.3+.

  TOPK_MOE(ne=[8,22,1,1],n_expert_used=4,with_norm=1,delayed_softmax=0): OK
Validation Warning: [ BestPractices-SpirvDeprecated_WorkgroupSize ] | MessageID = 0xd8a870c
vkCreateComputePipelines(): pCreateInfos[0].stage is using the SPIR-V Workgroup built-in which SPIR-V 1.6 deprecated. When using VK_KHR_maintenance4 or Vulkan 1.3+, the new SPIR-V LocalSizeId execution mode should be used instead. This can be done by recompiling your shader and targeting Vulkan 1.3+.

  TOPK_MOE(ne=[32,22,1,1],n_expert_used=8,with_norm=1,delayed_softmax=0): OK
Validation Warning: [ BestPractices-SpirvDeprecated_WorkgroupSize ] | MessageID = 0xd8a870c
vkCreateComputePipelines(): pCreateInfos[0].stage is using the SPIR-V Workgroup built-in which SPIR-V 1.6 deprecated. When using VK_KHR_maintenance4 or Vulkan 1.3+, the new SPIR-V LocalSizeId execution mode should be used instead. This can be done by recompiling your shader and targeting Vulkan 1.3+.

  TOPK_MOE(ne=[128,1,1,1],n_expert_used=128,with_norm=1,delayed_softmax=0): OK
Validation Warning: [ BestPractices-SpirvDeprecated_WorkgroupSize ] | MessageID = 0xd8a870c
vkCreateComputePipelines(): pCreateInfos[0].stage is using the SPIR-V Workgroup built-in which SPIR-V 1.6 deprecated. When using VK_KHR_maintenance4 or Vulkan 1.3+, the new SPIR-V LocalSizeId execution mode should be used instead. This can be done by recompiling your shader and targeting Vulkan 1.3+.

  TOPK_MOE(ne=[8,22,1,1],n_expert_used=4,with_norm=0,delayed_softmax=1): OK
Validation Warning: [ BestPractices-SpirvDeprecated_WorkgroupSize ] | MessageID = 0xd8a870c
vkCreateComputePipelines(): pCreateInfos[0].stage is using the SPIR-V Workgroup built-in which SPIR-V 1.6 deprecated. When using VK_KHR_maintenance4 or Vulkan 1.3+, the new SPIR-V LocalSizeId execution mode should be used instead. This can be done by recompiling your shader and targeting Vulkan 1.3+.

  TOPK_MOE(ne=[32,22,1,1],n_expert_used=8,with_norm=0,delayed_softmax=1): OK
  8/8 tests passed
  Backend Vulkan0: OK
Backend 2/2: CPU
  Skipping CPU backend
2/2 backends passed
OK
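
The remaining deprecated-extension warnings in both logs print the promoting core version twice, for example "1.3.0 (0x00403000)". The hex value is the packed form produced by VK_MAKE_API_VERSION. A small sketch of how it decodes follows; the helper names `api_major`, `api_minor` and `api_patch` are made up for illustration.

```cpp
#include <cstdint>

// Field layout used by VK_MAKE_API_VERSION, most to least significant:
// variant (3 bits), major (7 bits), minor (10 bits), patch (12 bits).
constexpr std::uint32_t api_major(std::uint32_t v) { return (v >> 22) & 0x7Fu; }
constexpr std::uint32_t api_minor(std::uint32_t v) { return (v >> 12) & 0x3FFu; }
constexpr std::uint32_t api_patch(std::uint32_t v) { return v & 0xFFFu; }

// Values taken from the warnings in the logs above.
static_assert(api_major(0x00403000u) == 1 && api_minor(0x00403000u) == 3, "1.3.0");
static_assert(api_major(0x00404000u) == 1 && api_minor(0x00404000u) == 4, "1.4.0");
static_assert(api_minor(0x00401000u) == 1 && api_patch(0x00401000u) == 0, "1.1.0");
```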

loci-review Bot commented Dec 1, 2025

Performance Analysis Summary - PR #383

Analysis Overview

This PR modernizes Vulkan validation layer configuration by replacing the deprecated VK_EXT_validation_features extension with VK_EXT_layer_settings. The change affects a single file (ggml-vulkan.cpp) in the Vulkan backend initialization path.

Performance Impact

Power Consumption Analysis:
All 16 binaries show zero measurable change in power consumption. The largest absolute deltas are within measurement noise: libllama.so (+0.95 nJ), llama-run (-0.51 nJ), llama-tts (+0.21 nJ). No binary exhibits performance regression.

Function-Level Analysis:
No functions show measurable changes in response time or throughput. The summary report returned no function-level performance deltas, which is consistent with the two versions performing identically.

Inference Impact:
Zero impact on tokens per second. The changes are isolated to Vulkan instance initialization code executed once at startup. Core inference functions (llama_decode, llama_encode, llama_tokenize) remain unmodified with no changes in response time or throughput. The initialization overhead is negligible (under 1 microsecond) and does not affect the inference hot path.

Code Changes:
The PR replaces deprecated Vulkan API calls with modern equivalents, adding 11 lines to configure layer settings via the new extension. The changes maintain identical validation behavior while eliminating deprecation warnings during development builds. Production builds with validation disabled experience zero impact.

loci-dev force-pushed the upstream-PR17637-branch_rillomas-replace-validation-features branch from 3db03c6 to 82b3726 on December 1, 2025 06:45

loci-review Bot commented Dec 1, 2025

Performance Analysis Summary: PR #383

Overview

This PR replaces the deprecated Vulkan extension VK_EXT_validation_features with VK_EXT_layer_settings in the Vulkan backend initialization code. The changes affect only validation layer configuration, active exclusively when GGML_VULKAN_VALIDATE=ON is set at compile time. Analysis shows no measurable performance impact on production inference paths.

Performance Impact

Inference Performance: No impact. The modified functions ggml_vk_instance_init and ggml_vk_instance_layer_settings_available execute once during process initialization, not during inference. Core inference functions remain unchanged:

  • llama_decode: 0 ns change
  • llama_encode: 0 ns change
  • llama_tokenize: 0 ns change

Tokens Per Second: No degradation expected. Tokenization and inference functions show no change in response time or throughput, so model performance should match the baseline.

Power Consumption: Negligible change across all binaries. The measured impact is below 0.001%; build.bin.libllama.so shows a variation of +0.00025% (+0.48 nJ), which is within measurement noise. The compiler generated functionally equivalent assembly code.

Modified Functions:

  • ggml_vk_instance_init: Initialization-only function, adds layer settings structure (11 lines)
  • ggml_vk_instance_layer_settings_available: Extension check function, renamed from the earlier validation_ext variant

Both functions execute outside inference hot paths and contribute zero overhead to token processing throughput.
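
For readers unfamiliar with this kind of extension check, a minimal sketch follows. It is not the body of ggml_vk_instance_layer_settings_available; `has_extension` is a hypothetical helper, and the list of names is passed in as plain strings so the sketch runs without the Vulkan SDK. In real code the names would come from vkEnumerateInstanceExtensionProperties.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: returns true when `wanted` appears in the list of
// extension names reported by the Vulkan loader.
bool has_extension(const std::vector<std::string> & available,
                   const std::string & wanted) {
    return std::find(available.begin(), available.end(), wanted) != available.end();
}
```

A caller would then add VK_EXT_layer_settings to the requested instance extensions only when `has_extension(names, "VK_EXT_layer_settings")` returns true. Such a check runs once at startup, which matches the analysis above.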

loci-dev force-pushed the main branch 24 times, most recently from 38683c7 to fa6cdcc, on December 3, 2025 09:11
loci-dev force-pushed the main branch 30 times, most recently from 84f6117 to 91eb894, on December 7, 2025 22:08