Pull requests: ggml-org/llama.cpp

Pull requests list

vulkan: Intel Xe flash attention, GEMM optimizations, and optional weight compression (Xe-LPG Plus/Xe2/Xe3) [MEGA PR]
Labels: examples, ggml, model, Vulkan
#24408 opened Jun 10, 2026 by fish-jiang Draft
vulkan: GEMM/Group GEMM optimizations and optional load-time weight compression for Intel MoE path (3/3, Xe-LPG Plus/Xe2/Xe3)
Labels: examples, ggml, model, Vulkan
#24407 opened Jun 10, 2026 by fish-jiang Draft
vulkan: add Intel Xe flash attention optimization kernels (2/3, Xe-LPG Plus/Xe2/Xe3)
Labels: ggml, Vulkan
#24406 opened Jun 10, 2026 by fish-jiang Draft
gguf : add tensor shape accessors
Labels: ggml, testing
#24405 opened Jun 10, 2026 by QuintinShaw
vulkan: add INTEL_PRE_XE2 arch enum and enable coopmat1 on Intel Xe-LPG Plus (1/3, Xe-LPG Plus)
Labels: ggml, Vulkan
#24404 opened Jun 10, 2026 by fish-jiang Draft
CUDA: extend K-type validation to V-types for flash attention
Labels: ggml, Nvidia GPU
#24403 opened Jun 10, 2026 by sanmai Contributor
vendor : update LibreSSL to 4.3.2
#24397 opened Jun 10, 2026 by angt Member
vendor : update cpp-httplib to 0.47.0
Labels: python, script
#24395 opened Jun 10, 2026 by angt Member
hexagon: store HMX flash-attention softmax accumulators in FP32
Labels: ggml, Hexagon
#24389 opened Jun 10, 2026 by njsyw1997 Contributor Draft
[SYCL] Fix CI build & release for SYCL backend
Labels: devops
#24387 opened Jun 10, 2026 by arthw Contributor
ggml: tune RDNA4 MMVQ warps for K-quants
Labels: ggml, Nvidia GPU
#24386 opened Jun 10, 2026 by ammarwa
mtmd: add batching API
Labels: examples, server
#24384 opened Jun 9, 2026 by ngxson Collaborator Draft
6 tasks
chat: fix LFM2/LFM2.5 ignoring json_schema
#24377 opened Jun 9, 2026 by tdakhran Contributor
server: avoid forwarding auth headers in CORS proxy
Labels: examples, python, server
#24373 opened Jun 9, 2026 by ItsMatti4
vocab : refactor normalizer flags into options struct, add strip_accents
Labels: merge ready, python
#24371 opened Jun 9, 2026 by o7si Contributor
metal : wind down leftover residency sets at teardown instead of aborting
Labels: Apple Metal, ggml
#24368 opened Jun 9, 2026 by AlexCherrypi
Force NVFP4 W4A8 path for NVFP4_W4A16 layers on Blackwell, where NVFP4 normally uses the native W4A4 path.
Labels: ggml, Nvidia GPU, python, testing
#24364 opened Jun 9, 2026 by ynankani Contributor
[SYCL] Support OP EXPM1, support all UT cases of FLOOR, TRUNC, ROUND
Labels: documentation, ggml, merge ready, SYCL
#24363 opened Jun 9, 2026 by arthw Contributor
vulkan: disable FA mask_opt on GCN to improve performance
Labels: ggml, Vulkan
#24362 opened Jun 9, 2026 by 0cc4m Contributor
mtmd, llama: shared backend sched
Labels: examples, server
#24361 opened Jun 9, 2026 by ngxson Collaborator Draft
CUDA: Fix ssm_scan_f32 data-races
Labels: ggml, Nvidia GPU
#24360 opened Jun 9, 2026 by ORippler Collaborator
ggml-zendnn : fix DL backend loading for Ollama
Labels: AMD ZenDNN, ggml
#24342 opened Jun 9, 2026 by z-sachin Contributor