@PlaidCat PlaidCat commented Sep 3, 2025

This is an attempt at a re-builder built on cron and some internal tools. The process is the same as in previous rebuilds:

  • Download all unprocessed src.rpm
  • For each src.rpm
    • Find all commits in changelog up to last known tag ... in this case 5.14.0-570
    • Re-play commits in reverse order (oldest in change log to newest) with git cherry-pick
    • After replay replace ENTIRE code in branch with rpmbuild -bp from corresponding src.rpm.
    • Tag Rebuild branch
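
The replay order in the steps above can be sketched as follows. This is a minimal illustration, not the actual rebuilder: `replay_plan` is a hypothetical helper, and it assumes the commits taken from the changelog arrive newest-first, so they are reversed before being handed to `git cherry-pick`. It only builds the command lists and does not run them.

```python
def replay_plan(changelog_shas):
    """Return cherry-pick commands ordered oldest to newest.

    changelog_shas: commit IDs in changelog order (assumed newest first).
    The name and input shape are hypothetical, for illustration only.
    """
    return [["git", "cherry-pick", sha] for sha in reversed(changelog_shas)]


if __name__ == "__main__":
    # Placeholder IDs, not real commits.
    for cmd in replay_plan(["newest", "middle", "oldest"]):
        print(" ".join(cmd))
```

Keeping the plan as data makes the ordering easy to check before anything touches the branch.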

Rebuild Splat Inspection

kernel-5.14.0-570.39.1.el9_6

[jmaple@devbox kernel-src-tree]$ cat ciq/ciq_backports/kernel-5.14.0-570.39.1.el9_6/rebuild.details.txt
Rebuild_History BUILDABLE
Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
Number of commits in upstream range v5.14~1..kernel-mainline: 323388
Number of commits in rpm: 37
Number of commits matched with upstream: 34 (91.89%)
Number of commits in upstream but not in rpm: 323354
Number of commits NOT found in upstream: 3 (8.11%)

Rebuilding Kernel on Branch rocky9_6_rebuild_kernel-5.14.0-570.39.1.el9_6 for kernel-5.14.0-570.39.1.el9_6
Clean Cherry Picks: 23 (67.65%)
Empty Cherry Picks: 5 (14.71%)
_______________________________

__EMPTY COMMITS__________________________
95d2b9f693ff2a1180a23d7d59acc0c4e72f4c41 Revert "smb: client: fix TCP timers deadlock after rmmod"
0d48566d4b58946c8e1b0baac0347616060a81c9 s390/pci: rename lock member in struct zpci_dev
bcb5d6c769039c8358a2359e7c3ea5d97ce93108 s390/pci: introduce lock to synchronize state of zpci_dev's
05a2538f2b48500cf4e8a0a0ce76623cc5bafcf1 s390/pci: Fix duplicate pci_dev_put() in disable_slot() when PF has child VFs
47c397844869ad0e6738afb5879c7492f4691122 s390/pci: Prevent self deletion in disable_slot()

__CHANGES NOT IN UPSTREAM________________
Porting to Rocky Linux 9, debranding and Rocky branding'
Ensure aarch64 kernel is not compressed'
Merge: net: mana: Fix race of mana_hwc_post_rx_wqe and new hwc response [rhel-9.6.z]

kernel-5.14.0-570.37.1.el9_6

[jmaple@devbox kernel-src-tree]$ cat ciq/ciq_backports/kernel-5.14.0-570.37.1.el9_6/rebuild.details.txt
Rebuild_History BUILDABLE
Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
Number of commits in upstream range v5.14~1..kernel-mainline: 323388
Number of commits in rpm: 78
Number of commits matched with upstream: 74 (94.87%)
Number of commits in upstream but not in rpm: 323314
Number of commits NOT found in upstream: 4 (5.13%)

Rebuilding Kernel on Branch rocky9_6_rebuild_kernel-5.14.0-570.37.1.el9_6 for kernel-5.14.0-570.37.1.el9_6
Clean Cherry Picks: 39 (52.70%)
Empty Cherry Picks: 28 (37.84%)
_______________________________

__EMPTY COMMITS__________________________
ee40c9920ac286c5bfe7c811e66ff899266d2582 mm: fix copy_vma() error handling for hugetlb mappings
081056dc00a27bccb55ccc3c6f230a3d5fd3f7e0 mm/hugetlb: unshare page tables during VMA split, not before
1013af4f585fccc4d3e5c5824d174de2257f7d6d mm/hugetlb: fix huge_pmd_unshare() vs GUP-fast race
5b3eaf10e2e0a3df5c8dfd6aabc6aec435383ba0 x86/platform/atom: Switch to new Intel CPU model defines
0e7b7bde460304f44e8c6b212c3195ac2f69f6fe kselftest: Move ksft helper module to common directory
170c966cbe274e664288cfc12ee919d5e706dc50 selftests: ksft: Fix finished() helper exit code on skipped tests
f6d9883f8e680460be4714d4d35c7acac1dffeaf tools/include: Sync x86 headers with the kernel sources
104edc6efca628389295392ceb87623fe10c41f6 x86/cpufeatures: Rename X86_FEATURE_FAST_CPPC to have AMD prefix
1ad466706671436994ec7e71305f44692fed989a x86/cpufeatures: Add X86_FEATURE_AMD_HETEROGENEOUS_CORES
85b08180df07b9a5984b15ae31d76b904d42a115 x86/cpu: Expose only stepping min/max interface
74864403c578d9caa92fb9b2743a4b7aa5240e44 selftests: Warn about skipped tests in result summary
c3390406adc62dd2d42eb522e1ce124fa43c5dec x86/cpu: Shorten CPU matching macro
00d7fc04b703eb3e9d61dd3eac02a34c466e9f12 x86/cpu: Add cpu_type to struct x86_cpu_id
722fa0dba74f206999244facb177a8bfe3d513e6 x86/rfds: Exclude P-only parts from the RFDS affected list
d4e89d212d401672e9cdfe825d947ee3a9fbe3f5 x86/bpf: Call branch history clearing sequence on exit
9f725eec8fc0b39bdc07dcc8897283c367c1a163 x86/bpf: Add IBHF call at end of classic BPF
073fdbe02c69c43fb7c0d547ec265c7747d4a646 x86/bhi: Do not set BHI_DIS_S in 32-bit mode
1ac116ce6468670eeda39345a5585df308243dca Documentation: x86/bugs/its: Add ITS documentation
159013a7ca18c271ff64192deb62a689b622d860 x86/its: Enumerate Indirect Target Selection (ITS) bug
8754e67ad4ac692c67ff1f99c0d07156f04ae40c x86/its: Add support for ITS-safe indirect thunk
a75bf27fe41abe658c53276a0c486c4bf9adecfc x86/its: Add support for ITS-safe return thunk
f4818881c47fd91fcb6d62373c57c7844e3de1c0 x86/its: Enable Indirect Target Selection mitigation
2665281a07e19550944e8354a2024635a7b2714a x86/its: Add "vmexit" option to skip mitigation on some CPUs
facd226f7e0c8ca936ac114aba43cb3e8b94e41e x86/its: Add support for RSB stuffing mitigation
ebebe30794d38c51f71fe4951ba6af4159d9837d x86/ibt: Keep IBT disabled during alternative patching
63de8abd97ddb9b758bd8f915ecbd18e1f1a87a0 arm64: insn: Add support for encoding DSB
df207de9d9e7a4d92f8567e2c539d9c8c12fd99d udp: Fix memory accounting leak.
4f5a52adeb1ad675ca33f1e1eacd9c0bbaf393d4 ethtool: Fix set RXNFC command with symmetric RSS hash

__CHANGES NOT IN UPSTREAM________________
Porting to Rocky Linux 9, debranding and Rocky branding'
Ensure aarch64 kernel is not compressed'
redhat: Mark kernel incompatible with xdp-tools<1.5.4
redhat/configs: Enable CONFIG_MITIGATION_ITS for x86
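
The percentages in the two reports above can be reproduced from the raw counts. The sketch below assumes, based only on the arithmetic working out, that the matched and not-found figures are relative to the number of commits in the rpm, and that the clean and empty cherry-pick figures are relative to the number of matched commits. `pct` is a helper introduced here for illustration.

```python
def pct(part, whole):
    """Format part/whole as a percentage with two decimals, as in the report."""
    return f"{100 * part / whole:.2f}%"


# Counts copied from the two rebuild.details.txt reports.
reports = {
    "kernel-5.14.0-570.39.1.el9_6": {"rpm": 37, "matched": 34, "clean": 23, "empty": 5},
    "kernel-5.14.0-570.37.1.el9_6": {"rpm": 78, "matched": 74, "clean": 39, "empty": 28},
}

for name, r in reports.items():
    print(name)
    print("  matched with upstream:", pct(r["matched"], r["rpm"]))
    print("  not found in upstream:", pct(r["rpm"] - r["matched"], r["rpm"]))
    print("  clean cherry picks:   ", pct(r["clean"], r["matched"]))
    print("  empty cherry picks:   ", pct(r["empty"], r["matched"]))
```

Clean and empty cherry picks together cover 28 of 34 matched commits in the first report and 67 of 74 in the second. The reports shown here do not break out the remainder.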

Build

[jmaple@devbox code]$ egrep -B 5 -A 5 "\[TIMER\]|^Starting Build" $(ls -t kbuild* | head -n1)
/mnt/code/kernel-src-tree
Running make mrproper...
  CLEAN   scripts/basic
  CLEAN   scripts/kconfig
  CLEAN   include/config include/generated
[TIMER]{MRPROPER}: 5s
x86_64 architecture detected, copying config
'configs/kernel-x86_64-rhel.config' -> '.config'
Setting Local Version for build
CONFIG_LOCALVERSION="-rocky9_6_rebuild-1b9ea68b26cf"
Making olddefconfig
--
  HOSTCC  scripts/kconfig/util.o
  HOSTLD  scripts/kconfig/conf
#
# configuration written to .config
#
Starting Build
  SYSHDR  arch/x86/include/generated/uapi/asm/unistd_64.h
  SYSHDR  arch/x86/include/generated/uapi/asm/unistd_32.h
  SYSHDR  arch/x86/include/generated/uapi/asm/unistd_x32.h
  SYSTBL  arch/x86/include/generated/asm/syscalls_32.h
  SYSHDR  arch/x86/include/generated/asm/unistd_32_ia32.h
--
  LD [M]  sound/xen/snd_xen_front.ko
  BTF [M] sound/usb/usx2y/snd-usb-usx2y.ko
  BTF [M] sound/virtio/virtio_snd.ko
  BTF [M] sound/x86/snd-hdmi-lpe-audio.ko
  BTF [M] sound/xen/snd_xen_front.ko
[TIMER]{BUILD}: 1551s
Making Modules
  INSTALL /lib/modules/5.14.0-rocky9_6_rebuild-1b9ea68b26cf/kernel/arch/x86/crypto/blake2s-x86_64.ko
  INSTALL /lib/modules/5.14.0-rocky9_6_rebuild-1b9ea68b26cf/kernel/arch/x86/crypto/blowfish-x86_64.ko
  INSTALL /lib/modules/5.14.0-rocky9_6_rebuild-1b9ea68b26cf/kernel/arch/x86/crypto/camellia-aesni-avx-x86_64.ko
  INSTALL /lib/modules/5.14.0-rocky9_6_rebuild-1b9ea68b26cf/kernel/arch/x86/crypto/camellia-aesni-avx2.ko
--
  SIGN    /lib/modules/5.14.0-rocky9_6_rebuild-1b9ea68b26cf/kernel/sound/usb/usx2y/snd-usb-us122l.ko
  SIGN    /lib/modules/5.14.0-rocky9_6_rebuild-1b9ea68b26cf/kernel/sound/virtio/virtio_snd.ko
  SIGN    /lib/modules/5.14.0-rocky9_6_rebuild-1b9ea68b26cf/kernel/sound/x86/snd-hdmi-lpe-audio.ko
  SIGN    /lib/modules/5.14.0-rocky9_6_rebuild-1b9ea68b26cf/kernel/sound/xen/snd_xen_front.ko
  DEPMOD  /lib/modules/5.14.0-rocky9_6_rebuild-1b9ea68b26cf
[TIMER]{MODULES}: 8s
Making Install
sh ./arch/x86/boot/install.sh 5.14.0-rocky9_6_rebuild-1b9ea68b26cf \
        arch/x86/boot/bzImage System.map "/boot"
[TIMER]{INSTALL}: 23s
Checking kABI
kABI check passed
Setting Default Kernel to /boot/vmlinuz-5.14.0-rocky9_6_rebuild-ee328fded72f and Index to 3
Hopefully Grub2.0 took everything ... rebooting after time metrices
[TIMER]{MRPROPER}: 5s
[TIMER]{BUILD}: 1551s
[TIMER]{MODULES}: 8s
[TIMER]{INSTALL}: 23s
[TIMER]{TOTAL} 1593s
Rebooting in 10 seconds

KselfTests

[jmaple@devbox code]$ ls -rt kselftest.* | tail -n4 | while read line; do echo $line; grep '^ok ' $line | wc -l ; done
kselftest.5.14.0-rocky9_6_rebuild-9fbeb8c24bbd.log
317
kselftest.5.14.0-570.32.1.el9_6.x86_64.log
317
kselftest.5.14.0-rocky9_6_rebuild-e0a1a84bc26b.log
317
kselftest.5.14.0-rocky9_6_rebuild-ee328fded72f.log
317
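
For reference, the `grep '^ok ' | wc -l` count used above can be expressed as a small function. This is only a sketch of the same check; the sample lines are made up for illustration and are not taken from the actual logs.

```python
def count_ok(lines):
    """Count lines that start with 'ok ', mirroring grep '^ok ' | wc -l."""
    return sum(1 for line in lines if line.startswith("ok "))


if __name__ == "__main__":
    # Made-up sample lines in a TAP-like shape.
    sample = [
        "TAP version 13",
        "ok 1 selftests: example: first",
        "not ok 2 selftests: example: second",
        "ok 3 selftests: example: third",
    ]
    print(count_ok(sample))
```

Lines beginning with `not ok` are not counted, since the pattern is anchored at the start of the line.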

jira LE-4018
cve CVE-2025-38250
Rebuild_History Non-Buildable kernel-5.14.0-570.37.1.el9_6
commit-author Kuniyuki Iwashima <[email protected]>
commit 1d61231

syzbot reported use-after-free in vhci_flush() without repro. [0]

From the splat, a thread close()d a vhci file descriptor while
its device was being used by ioctl() on another thread.

Once the last fd refcnt is released, vhci_release() calls
hci_unregister_dev(), hci_free_dev(), and kfree() for struct
vhci_data, which is set to hci_dev->dev->driver_data.

The problem is that there is no synchronisation after unlinking
hdev from hci_dev_list in hci_unregister_dev().  There might be
another thread still accessing the hdev which was fetched before
the unlink operation.

We can use SRCU for such synchronisation.

Let's run hci_dev_reset() under SRCU and wait for its completion
in hci_unregister_dev().

Another option would be to restore hci_dev->destruct(), which was
removed in commit 587ae08 ("Bluetooth: Remove unused
hci-destruct cb").  However, this would not be a good solution, as
we should not run hci_unregister_dev() while there are in-flight
ioctl() requests, which could lead to another data-race KCSAN splat.

Note that other drivers seem to have the same problem, for example,
virtbt_remove().

[0]:
BUG: KASAN: slab-use-after-free in skb_queue_empty_lockless include/linux/skbuff.h:1891 [inline]
BUG: KASAN: slab-use-after-free in skb_queue_purge_reason+0x99/0x360 net/core/skbuff.c:3937
Read of size 8 at addr ffff88807cb8d858 by task syz.1.219/6718

CPU: 1 UID: 0 PID: 6718 Comm: syz.1.219 Not tainted 6.16.0-rc1-syzkaller-00196-g08207f42d3ff #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/07/2025
Call Trace:
 <TASK>
 dump_stack_lvl+0x189/0x250 lib/dump_stack.c:120
 print_address_description mm/kasan/report.c:408 [inline]
 print_report+0xd2/0x2b0 mm/kasan/report.c:521
 kasan_report+0x118/0x150 mm/kasan/report.c:634
 skb_queue_empty_lockless include/linux/skbuff.h:1891 [inline]
 skb_queue_purge_reason+0x99/0x360 net/core/skbuff.c:3937
 skb_queue_purge include/linux/skbuff.h:3368 [inline]
 vhci_flush+0x44/0x50 drivers/bluetooth/hci_vhci.c:69
 hci_dev_do_reset net/bluetooth/hci_core.c:552 [inline]
 hci_dev_reset+0x420/0x5c0 net/bluetooth/hci_core.c:592
 sock_do_ioctl+0xd9/0x300 net/socket.c:1190
 sock_ioctl+0x576/0x790 net/socket.c:1311
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:907 [inline]
 __se_sys_ioctl+0xf9/0x170 fs/ioctl.c:893
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fcf5b98e929
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fcf5c7b9038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007fcf5bbb6160 RCX: 00007fcf5b98e929
RDX: 0000000000000000 RSI: 00000000400448cb RDI: 0000000000000009
RBP: 00007fcf5ba10b39 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 00007fcf5bbb6160 R15: 00007ffd6353d528
 </TASK>

Allocated by task 6535:
 kasan_save_stack mm/kasan/common.c:47 [inline]
 kasan_save_track+0x3e/0x80 mm/kasan/common.c:68
 poison_kmalloc_redzone mm/kasan/common.c:377 [inline]
 __kasan_kmalloc+0x93/0xb0 mm/kasan/common.c:394
 kasan_kmalloc include/linux/kasan.h:260 [inline]
 __kmalloc_cache_noprof+0x230/0x3d0 mm/slub.c:4359
 kmalloc_noprof include/linux/slab.h:905 [inline]
 kzalloc_noprof include/linux/slab.h:1039 [inline]
 vhci_open+0x57/0x360 drivers/bluetooth/hci_vhci.c:635
 misc_open+0x2bc/0x330 drivers/char/misc.c:161
 chrdev_open+0x4c9/0x5e0 fs/char_dev.c:414
 do_dentry_open+0xdf0/0x1970 fs/open.c:964
 vfs_open+0x3b/0x340 fs/open.c:1094
 do_open fs/namei.c:3887 [inline]
 path_openat+0x2ee5/0x3830 fs/namei.c:4046
 do_filp_open+0x1fa/0x410 fs/namei.c:4073
 do_sys_openat2+0x121/0x1c0 fs/open.c:1437
 do_sys_open fs/open.c:1452 [inline]
 __do_sys_openat fs/open.c:1468 [inline]
 __se_sys_openat fs/open.c:1463 [inline]
 __x64_sys_openat+0x138/0x170 fs/open.c:1463
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Freed by task 6535:
 kasan_save_stack mm/kasan/common.c:47 [inline]
 kasan_save_track+0x3e/0x80 mm/kasan/common.c:68
 kasan_save_free_info+0x46/0x50 mm/kasan/generic.c:576
 poison_slab_object mm/kasan/common.c:247 [inline]
 __kasan_slab_free+0x62/0x70 mm/kasan/common.c:264
 kasan_slab_free include/linux/kasan.h:233 [inline]
 slab_free_hook mm/slub.c:2381 [inline]
 slab_free mm/slub.c:4643 [inline]
 kfree+0x18e/0x440 mm/slub.c:4842
 vhci_release+0xbc/0xd0 drivers/bluetooth/hci_vhci.c:671
 __fput+0x44c/0xa70 fs/file_table.c:465
 task_work_run+0x1d1/0x260 kernel/task_work.c:227
 exit_task_work include/linux/task_work.h:40 [inline]
 do_exit+0x6ad/0x22e0 kernel/exit.c:955
 do_group_exit+0x21c/0x2d0 kernel/exit.c:1104
 __do_sys_exit_group kernel/exit.c:1115 [inline]
 __se_sys_exit_group kernel/exit.c:1113 [inline]
 __x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1113
 x64_sys_call+0x21ba/0x21c0 arch/x86/include/generated/asm/syscalls_64.h:232
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

The buggy address belongs to the object at ffff88807cb8d800
 which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 88 bytes inside of
 freed 1024-byte region [ffff88807cb8d800, ffff88807cb8dc00)

Fixes: bf18c71 ("Bluetooth: vhci: Free driver_data on file release")
	Reported-by: [email protected]
Closes: https://syzkaller.appspot.com/bug?extid=f62d64848fc4c7c30cd6
	Signed-off-by: Kuniyuki Iwashima <[email protected]>
	Acked-by: Paul Menzel <[email protected]>
	Signed-off-by: Luiz Augusto von Dentz <[email protected]>
(cherry picked from commit 1d61231)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4018
Rebuild_History Non-Buildable kernel-5.14.0-570.37.1.el9_6
commit-author Ricardo Cañuelo Navarro <[email protected]>
commit ee40c99
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.37.1.el9_6/ee40c992.failed

If, during a mremap() operation for a hugetlb-backed memory mapping,
copy_vma() fails after the source vma has been duplicated and opened (ie.
vma_link() fails), the error is handled by closing the new vma.  This
updates the hugetlbfs reservation counter of the reservation map which at
this point is referenced by both the source vma and the new copy.  As a
result, once the new vma has been freed and copy_vma() returns, the
reservation counter for the source vma will be incorrect.

This patch addresses this corner case by clearing the hugetlb private page
reservation reference for the new vma and decrementing the reference
before closing the vma, so that vma_close() won't update the reservation
counter.  This is also what copy_vma_and_data() does with the source vma
if copy_vma() succeeds, so a helper function has been added to do the
fixup in both functions.

The issue was reported by a private syzbot instance and can be reproduced
using the C reproducer in [1].  It's also a possible duplicate of public
syzbot report [2].  The WARNING report is:

============================================================
page_counter underflow: -1024 nr_pages=1024
WARNING: CPU: 0 PID: 3287 at mm/page_counter.c:61 page_counter_cancel+0xf6/0x120
Modules linked in:
CPU: 0 UID: 0 PID: 3287 Comm: repro__WARNING_ Not tainted 6.15.0-rc7+ #54 NONE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-2-gc13ff2cd-prebuilt.qemu.org 04/01/2014
RIP: 0010:page_counter_cancel+0xf6/0x120
Code: ff 5b 41 5e 41 5f 5d c3 cc cc cc cc e8 f3 4f 8f ff c6 05 64 01 27 06 01 48 c7 c7 60 15 f8 85 48 89 de 4c 89 fa e8 2a a7 51 ff <0f> 0b e9 66 ff ff ff 44 89 f9 80 e1 07 38 c1 7c 9d 4c 81
RSP: 0018:ffffc900025df6a0 EFLAGS: 00010246
RAX: 2edfc409ebb44e00 RBX: fffffffffffffc00 RCX: ffff8880155f0000
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
RBP: dffffc0000000000 R08: ffffffff81c4a23c R09: 1ffff1100330482a
R10: dffffc0000000000 R11: ffffed100330482b R12: 0000000000000000
R13: ffff888058a882c0 R14: ffff888058a882c0 R15: 0000000000000400
FS:  0000000000000000(0000) GS:ffff88808fc53000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000004b33e0 CR3: 00000000076d6000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 page_counter_uncharge+0x33/0x80
 hugetlb_cgroup_uncharge_counter+0xcb/0x120
 hugetlb_vm_op_close+0x579/0x960
 ? __pfx_hugetlb_vm_op_close+0x10/0x10
 remove_vma+0x88/0x130
 exit_mmap+0x71e/0xe00
 ? __pfx_exit_mmap+0x10/0x10
 ? __mutex_unlock_slowpath+0x22e/0x7f0
 ? __pfx_exit_aio+0x10/0x10
 ? __up_read+0x256/0x690
 ? uprobe_clear_state+0x274/0x290
 ? mm_update_next_owner+0xa9/0x810
 __mmput+0xc9/0x370
 exit_mm+0x203/0x2f0
 ? __pfx_exit_mm+0x10/0x10
 ? taskstats_exit+0x32b/0xa60
 do_exit+0x921/0x2740
 ? do_raw_spin_lock+0x155/0x3b0
 ? __pfx_do_exit+0x10/0x10
 ? __pfx_do_raw_spin_lock+0x10/0x10
 ? _raw_spin_lock_irq+0xc5/0x100
 do_group_exit+0x20c/0x2c0
 get_signal+0x168c/0x1720
 ? __pfx_get_signal+0x10/0x10
 ? schedule+0x165/0x360
 arch_do_signal_or_restart+0x8e/0x7d0
 ? __pfx_arch_do_signal_or_restart+0x10/0x10
 ? __pfx___se_sys_futex+0x10/0x10
 syscall_exit_to_user_mode+0xb8/0x2c0
 do_syscall_64+0x75/0x120
 entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x422dcd
Code: Unable to access opcode bytes at 0x422da3.
RSP: 002b:00007ff266cdb208 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: 0000000000000001 RBX: 00007ff266cdbcdc RCX: 0000000000422dcd
RDX: 00000000000f4240 RSI: 0000000000000081 RDI: 00000000004c7bec
RBP: 00007ff266cdb220 R08: 203a6362696c6720 R09: 203a6362696c6720
R10: 0000200000c00000 R11: 0000000000000246 R12: ffffffffffffffd0
R13: 0000000000000002 R14: 00007ffe1cb5f520 R15: 00007ff266cbb000
 </TASK>
============================================================

Link: https://lkml.kernel.org/r/20250523-warning_in_page_counter_cancel-v2-1-b6df1a8cfefd@igalia.com
Link: https://people.igalia.com/rcn/kernel_logs/20250422__WARNING_in_page_counter_cancel__repro.c [1]
Link: https://lore.kernel.org/all/[email protected]/ [2]
	Signed-off-by: Ricardo Cañuelo Navarro <[email protected]>
	Suggested-by: Lorenzo Stoakes <[email protected]>
	Reviewed-by: Liam R. Howlett <[email protected]>
	Cc: Florent Revest <[email protected]>
	Cc: Jann Horn <[email protected]>
	Cc: Oscar Salvador <[email protected]>
	Cc: Vlastimil Babka <[email protected]>
	Cc: <[email protected]>
	Signed-off-by: Andrew Morton <[email protected]>
(cherry picked from commit ee40c99)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	mm/hugetlb.c
#	mm/mremap.c
#	mm/vma.c
jira LE-4018
cve CVE-2025-38084
Rebuild_History Non-Buildable kernel-5.14.0-570.37.1.el9_6
commit-author Jann Horn <[email protected]>
commit 081056d
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.37.1.el9_6/081056dc.failed

Currently, __split_vma() triggers hugetlb page table unsharing through
vm_ops->may_split().  This happens before the VMA lock and rmap locks are
taken - which is too early, it allows racing VMA-locked page faults in our
process and racing rmap walks from other processes to cause page tables to
be shared again before we actually perform the split.

Fix it by explicitly calling into the hugetlb unshare logic from
__split_vma() in the same place where THP splitting also happens.  At that
point, both the VMA and the rmap(s) are write-locked.

An annoying detail is that we can now call into the helper
hugetlb_unshare_pmds() from two different locking contexts:

1. from hugetlb_split(), holding:
    - mmap lock (exclusively)
    - VMA lock
    - file rmap lock (exclusively)
2. hugetlb_unshare_all_pmds(), which I think is designed to be able to
   call us with only the mmap lock held (in shared mode), but currently
   only runs while holding mmap lock (exclusively) and VMA lock

Backporting note:
This commit fixes a racy protection that was introduced in commit
b30c14c ("hugetlb: unshare some PMDs when splitting VMAs"); that
commit claimed to fix an issue introduced in 5.13, but it should actually
also go all the way back.

[[email protected]: v2]
  Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Fixes: 39dde65 ("[PATCH] shared page table for hugetlb page")
	Signed-off-by: Jann Horn <[email protected]>
	Cc: Liam Howlett <[email protected]>
	Reviewed-by: Lorenzo Stoakes <[email protected]>
	Reviewed-by: Oscar Salvador <[email protected]>
	Cc: Lorenzo Stoakes <[email protected]>
	Cc: Vlastimil Babka <[email protected]>
	Cc: <[email protected]>	[b30c14c: hugetlb: unshare some PMDs when splitting VMAs]
	Cc: <[email protected]>
	Signed-off-by: Andrew Morton <[email protected]>
(cherry picked from commit 081056d)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	include/linux/hugetlb.h
#	mm/vma.c
#	tools/testing/vma/vma_internal.h
jira LE-4018
cve CVE-2025-38085
Rebuild_History Non-Buildable kernel-5.14.0-570.37.1.el9_6
commit-author Jann Horn <[email protected]>
commit 1013af4
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.37.1.el9_6/1013af4f.failed

huge_pmd_unshare() drops a reference on a page table that may have
previously been shared across processes, potentially turning it into a
normal page table used in another process in which unrelated VMAs can
afterwards be installed.

If this happens in the middle of a concurrent gup_fast(), gup_fast() could
end up walking the page tables of another process.  While I don't see any
way in which that immediately leads to kernel memory corruption, it is
really weird and unexpected.

Fix it with an explicit broadcast IPI through tlb_remove_table_sync_one(),
just like we do in khugepaged when removing page tables for a THP
collapse.

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Fixes: 39dde65 ("[PATCH] shared page table for hugetlb page")
	Signed-off-by: Jann Horn <[email protected]>
	Reviewed-by: Lorenzo Stoakes <[email protected]>
	Cc: Liam Howlett <[email protected]>
	Cc: Muchun Song <[email protected]>
	Cc: Oscar Salvador <[email protected]>
	Cc: Vlastimil Babka <[email protected]>
	Cc: <[email protected]>
	Signed-off-by: Andrew Morton <[email protected]>
(cherry picked from commit 1013af4)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	mm/hugetlb.c
jira LE-4018
cve CVE-2025-38124
Rebuild_History Non-Buildable kernel-5.14.0-570.37.1.el9_6
commit-author Shiming Cheng <[email protected]>
commit 3382a1e

Commit a1e40ac ("net: gso: fix udp gso fraglist segmentation after
pull from frag_list") detected invalid geometry in frag_list skbs and
redirects them from skb_segment_list to more robust skb_segment. But some
packets with modified geometry can also hit bugs in that code. We don't
know how many such cases exist. Addressing each one by one also requires
touching the complex skb_segment code, which risks introducing bugs for
other types of skbs. Instead, linearize all these packets that fail the
basic invariants on gso fraglist skbs. That is more robust.

If only part of the fraglist payload is pulled into head_skb, it will
always cause exception when splitting skbs by skb_segment. For detailed
call stack information, see below.

Valid SKB_GSO_FRAGLIST skbs
- consist of two or more segments
- the head_skb holds the protocol headers plus first gso_size
- one or more frag_list skbs hold exactly one segment
- all but the last must be gso_size

Optional datapath hooks such as NAT and BPF (bpf_skb_pull_data) can
modify fraglist skbs, breaking these invariants.

In extreme cases they pull part of the data into the skb linear area. For
UDP, this causes three payloads with lengths of (11,11,10) bytes to
become (12,10,10) bytes after the pull.

The skb no longer meets the above SKB_GSO_FRAGLIST conditions because
payload was pulled into head_skb, so it needs to be linearized before being
passed to the regular skb_segment.
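
The invariants listed above can be written as a small check. This is a simplified model for illustration, not kernel code: it looks only at payload lengths, ignores protocol headers, and `is_valid_fraglist` is a hypothetical name. It also assumes a gso_size of 11 for the (11,11,10) example, which the text does not state explicitly.

```python
def is_valid_fraglist(head_payload, frag_payloads, gso_size):
    """Check the SKB_GSO_FRAGLIST payload invariants on plain integers.

    head_payload: payload bytes held by head_skb (headers ignored here).
    frag_payloads: payload bytes of each frag_list skb, in order.
    """
    if not frag_payloads:
        return False  # needs two or more segments in total
    if head_payload != gso_size:
        return False  # head must hold exactly the first gso_size bytes
    if any(n != gso_size for n in frag_payloads[:-1]):
        return False  # all but the last segment must be gso_size
    return 0 < frag_payloads[-1] <= gso_size


if __name__ == "__main__":
    print(is_valid_fraglist(11, [11, 10], gso_size=11))  # original geometry
    print(is_valid_fraglist(12, [10, 10], gso_size=11))  # after the pull
```

In this model the (12,10,10) layout fails the check on the head length, which is the case the commit handles by linearizing.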

    skb_segment+0xcd0/0xd14
    __udp_gso_segment+0x334/0x5f4
    udp4_ufo_fragment+0x118/0x15c
    inet_gso_segment+0x164/0x338
    skb_mac_gso_segment+0xc4/0x13c
    __skb_gso_segment+0xc4/0x124
    validate_xmit_skb+0x9c/0x2c0
    validate_xmit_skb_list+0x4c/0x80
    sch_direct_xmit+0x70/0x404
    __dev_queue_xmit+0x64c/0xe5c
    neigh_resolve_output+0x178/0x1c4
    ip_finish_output2+0x37c/0x47c
    __ip_finish_output+0x194/0x240
    ip_finish_output+0x20/0xf4
    ip_output+0x100/0x1a0
    NF_HOOK+0xc4/0x16c
    ip_forward+0x314/0x32c
    ip_rcv+0x90/0x118
    __netif_receive_skb+0x74/0x124
    process_backlog+0xe8/0x1a4
    __napi_poll+0x5c/0x1f8
    net_rx_action+0x154/0x314
    handle_softirqs+0x154/0x4b8

    [118.376811] [C201134] rxq0_pus: [name:bug&]kernel BUG at net/core/skbuff.c:4278!
    [118.376829] [C201134] rxq0_pus: [name:traps&]Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
    [118.470774] [C201134] rxq0_pus: [name:mrdump&]Kernel Offset: 0x178cc00000 from 0xffffffc008000000
    [118.470810] [C201134] rxq0_pus: [name:mrdump&]PHYS_OFFSET: 0x40000000
    [118.470827] [C201134] rxq0_pus: [name:mrdump&]pstate: 60400005 (nZCv daif +PAN -UAO)
    [118.470848] [C201134] rxq0_pus: [name:mrdump&]pc : [0xffffffd79598aefc] skb_segment+0xcd0/0xd14
    [118.470900] [C201134] rxq0_pus: [name:mrdump&]lr : [0xffffffd79598a5e8] skb_segment+0x3bc/0xd14
    [118.470928] [C201134] rxq0_pus: [name:mrdump&]sp : ffffffc008013770

Fixes: a1e40ac ("gso: fix udp gso fraglist segmentation after pull from frag_list")
	Signed-off-by: Shiming Cheng <[email protected]>
	Reviewed-by: Willem de Bruijn <[email protected]>
	Signed-off-by: David S. Miller <[email protected]>
(cherry picked from commit 3382a1e)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4018
cve CVE-2025-38471
Rebuild_History Non-Buildable kernel-5.14.0-570.37.1.el9_6
commit-author Jakub Kicinski <[email protected]>
commit 4ab26bc

After recent changes in net-next TCP compacts skbs much more
aggressively. This unearthed a bug in TLS where we may try
to operate on an old skb when checking if all skbs in the
queue have matching decrypt state and geometry.

    BUG: KASAN: slab-use-after-free in tls_strp_check_rcv+0x898/0x9a0 [tls]
    (net/tls/tls_strp.c:436 net/tls/tls_strp.c:530 net/tls/tls_strp.c:544)
    Read of size 4 at addr ffff888013085750 by task tls/13529

    CPU: 2 UID: 0 PID: 13529 Comm: tls Not tainted 6.16.0-rc5-virtme
    Call Trace:
     kasan_report+0xca/0x100
     tls_strp_check_rcv+0x898/0x9a0 [tls]
     tls_rx_rec_wait+0x2c9/0x8d0 [tls]
     tls_sw_recvmsg+0x40f/0x1aa0 [tls]
     inet_recvmsg+0x1c3/0x1f0

Always reload the queue, fast path is to have the record in the queue
when we wake, anyway (IOW the path going down "if !strp->stm.full_len").

Fixes: 0d87bbd ("tls: strp: make sure the TCP skbs do not have overlapping data")
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 4ab26bc)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4018
cve CVE-2025-38380
Rebuild_History Non-Buildable kernel-5.14.0-570.37.1.el9_6
commit-author Michael J. Ruhl <[email protected]>
commit 3d30048

The i2c_dw_xfer_init() function requires msgs and msg_write_idx from the
dev context to be initialized.

amd_i2c_dw_xfer_quirk() inits msgs and msgs_num, but not msg_write_idx.

This could allow an out of bounds access (of msgs).

Initialize msg_write_idx before calling i2c_dw_xfer_init().

	Reviewed-by: Andy Shevchenko <[email protected]>
Fixes: 17631e8 ("i2c: designware: Add driver support for AMD NAVI GPU")
	Cc: <[email protected]> # v5.13+
	Signed-off-by: Michael J. Ruhl <[email protected]>
	Signed-off-by: Andi Shyti <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
(cherry picked from commit 3d30048)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4018
Rebuild_History Non-Buildable kernel-5.14.0-570.37.1.el9_6
commit-author Qiuxu Zhuo <[email protected]>
commit e77086c

The Grand Ridge CPU model uses similar memory controller registers with
Granite Rapids server. Add Grand Ridge CPU model ID for EDAC support.

	Tested-by: Ricardo Neri <[email protected]>
	Signed-off-by: Qiuxu Zhuo <[email protected]>
	Signed-off-by: Tony Luck <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
(cherry picked from commit e77086c)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4018
Rebuild_History Non-Buildable kernel-5.14.0-570.37.1.el9_6
commit-author Tony Luck <[email protected]>
commit 8a28b02

New CPU #defines encode vendor and family as well as model.

	Signed-off-by: Tony Luck <[email protected]>
	Signed-off-by: Borislav Petkov (AMD) <[email protected]>
	Acked-by: Josh Poimboeuf <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
(cherry picked from commit 8a28b02)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4018
Rebuild_History Non-Buildable kernel-5.14.0-570.37.1.el9_6
commit-author Tony Luck <[email protected]>
commit 8fb5f44

New CPU #defines encode vendor and family as well as model.

	Signed-off-by: Tony Luck <[email protected]>
	Signed-off-by: Dave Hansen <[email protected]>
	Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Link: https://lore.kernel.org/all/20240424181504.41634-1-tony.luck%40intel.com
(cherry picked from commit 8fb5f44)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4018
Rebuild_History Non-Buildable kernel-5.14.0-570.37.1.el9_6
commit-author Tony Luck <[email protected]>
commit d32bc21

New CPU #defines encode vendor and family as well as model.

	Signed-off-by: Tony Luck <[email protected]>
	Signed-off-by: Dave Hansen <[email protected]>
	Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Link: https://lore.kernel.org/all/20240424181505.41654-1-tony.luck%40intel.com
(cherry picked from commit d32bc21)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4018
Rebuild_History Non-Buildable kernel-5.14.0-570.37.1.el9_6
commit-author Tony Luck <[email protected]>
commit 9302248

Code in v6.9 arch/x86/kernel/smpboot.c was changed by commit

  4db6427 ("x86/cpu: Switch to new Intel CPU model defines") from:

  static const struct x86_cpu_id intel_cod_cpu[] = {
          X86_MATCH_INTEL_FAM6_MODEL(HASWELL_X, 0),       /* COD */
          X86_MATCH_INTEL_FAM6_MODEL(BROADWELL_X, 0),     /* COD */
          X86_MATCH_INTEL_FAM6_MODEL(ANY, 1),             /* SNC */	<--- 443
          {}
  };

  static bool match_llc(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
  {
          const struct x86_cpu_id *id = x86_match_cpu(intel_cod_cpu);

to:

  static const struct x86_cpu_id intel_cod_cpu[] = {
           X86_MATCH_VFM(INTEL_HASWELL_X,   0),    /* COD */
           X86_MATCH_VFM(INTEL_BROADWELL_X, 0),    /* COD */
           X86_MATCH_VFM(INTEL_ANY,         1),    /* SNC */
           {}
   };

  static bool match_llc(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
  {
          const struct x86_cpu_id *id = x86_match_cpu(intel_cod_cpu);

On an Intel CPU with SNC enabled this code previously matched the rule on line
443 to avoid printing messages about insane cache configuration.  The new code
did not match any rules.

Expanding the macros for the intel_cod_cpu[] array shows that the old is
equivalent to:

  static const struct x86_cpu_id intel_cod_cpu[] = {
  [0] = { .vendor = 0, .family = 6, .model = 0x3F, .steppings = 0, .feature = 0, .driver_data = 0 },
  [1] = { .vendor = 0, .family = 6, .model = 0x4F, .steppings = 0, .feature = 0, .driver_data = 0 },
  [2] = { .vendor = 0, .family = 6, .model = 0x00, .steppings = 0, .feature = 0, .driver_data = 1 },
  [3] = { .vendor = 0, .family = 0, .model = 0x00, .steppings = 0, .feature = 0, .driver_data = 0 }
  }

while the new code expands to:

  static const struct x86_cpu_id intel_cod_cpu[] = {
  [0] = { .vendor = 0, .family = 6, .model = 0x3F, .steppings = 0, .feature = 0, .driver_data = 0 },
  [1] = { .vendor = 0, .family = 6, .model = 0x4F, .steppings = 0, .feature = 0, .driver_data = 0 },
  [2] = { .vendor = 0, .family = 0, .model = 0x00, .steppings = 0, .feature = 0, .driver_data = 1 },
  [3] = { .vendor = 0, .family = 0, .model = 0x00, .steppings = 0, .feature = 0, .driver_data = 0 }
  }

Looking at the code for x86_match_cpu():

  const struct x86_cpu_id *x86_match_cpu(const struct x86_cpu_id *match)
  {
           const struct x86_cpu_id *m;
           struct cpuinfo_x86 *c = &boot_cpu_data;

           for (m = match;
                m->vendor | m->family | m->model | m->steppings | m->feature;
                m++) {
       		...
           }
           return NULL;

it is clear that there was no match because the ANY entry in the table (array
index 2) is now the loop termination condition (all of vendor, family, model,
steppings, and feature are zero).

So this code was working before because the "ANY" check was looking for any
Intel CPU in family 6. It fails now because the family is a wildcard. So the
root cause is that x86_match_cpu() has never been able to match on a rule with
just X86_VENDOR_INTEL and all other fields set to wildcards.

Add a new flags field to struct x86_cpu_id that has a bit set to indicate that
this entry in the array is valid. Update X86_MATCH*() macros to set that bit.
Change the end-marker check in x86_match_cpu() to just check the flags field
for this bit.
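
The termination problem and the flags-based fix described above can be sketched in plain C. This is a minimal user-space illustration, not the kernel code: `struct cpu_id`, `ID_FLAG_VALID`, `demo_table`, `match_old()` and `match_new()` are hypothetical names, and the matching logic is reduced to vendor, family and model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct x86_cpu_id, not the kernel definition. */
struct cpu_id {
	uint16_t vendor;
	uint16_t family;
	uint16_t model;
	uint16_t steppings;
	uint16_t feature;
	uint16_t flags;		/* new field: marks the entry as valid */
	unsigned long driver_data;
};

#define VENDOR_INTEL	0	/* X86_VENDOR_INTEL is 0, as the text notes */
#define FAMILY_ANY	0	/* wildcard values assumed to be 0 here */
#define MODEL_ANY	0
#define ID_FLAG_VALID	0x1	/* hypothetical flag name */

/* Mirrors the "new" table from the text, with the valid bit added. */
const struct cpu_id demo_table[] = {
	{ VENDOR_INTEL, 6,          0x3F,      0, 0, ID_FLAG_VALID, 0 },
	{ VENDOR_INTEL, 6,          0x4F,      0, 0, ID_FLAG_VALID, 0 },
	{ VENDOR_INTEL, FAMILY_ANY, MODEL_ANY, 0, 0, ID_FLAG_VALID, 1 },
	{ 0 }			/* end marker: flags == 0 */
};

static int entry_matches(const struct cpu_id *m, uint16_t vendor,
			 uint16_t family, uint16_t model)
{
	if (m->vendor != vendor)
		return 0;
	if (m->family != FAMILY_ANY && m->family != family)
		return 0;
	if (m->model != MODEL_ANY && m->model != model)
		return 0;
	return 1;
}

/* Old end-marker check: stops at the first entry whose fields are all zero. */
int match_old(const struct cpu_id *table, uint16_t vendor,
	      uint16_t family, uint16_t model)
{
	const struct cpu_id *m;

	for (m = table;
	     m->vendor | m->family | m->model | m->steppings | m->feature;
	     m++) {
		if (entry_matches(m, vendor, family, model))
			return (int)(m - table);
	}
	return -1;
}

/* New end-marker check: only the valid bit decides where the table ends. */
int match_new(const struct cpu_id *table, uint16_t vendor,
	      uint16_t family, uint16_t model)
{
	const struct cpu_id *m;

	for (m = table; m->flags & ID_FLAG_VALID; m++) {
		if (entry_matches(m, vendor, family, model))
			return (int)(m - table);
	}
	return -1;
}
```

With this table, a family 6 CPU whose model is not listed falls through to index 2 under `match_new()`, while `match_old()` stops at index 2 and reports no match.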

Backporter notes: The commit in Fixes is really the one that is broken:
you can't have m->vendor as part of the loop termination conditional in
x86_match_cpu() because it can happen - as it has happened above
- that that whole conditional is 0 albeit vendor == 0 is a valid case
- X86_VENDOR_INTEL is 0.

However, the only case where the above happens is the SNC check added by
4db6427 so you only need this fix if you have backported that
other commit

  4db6427 ("x86/cpu: Switch to new Intel CPU model defines")

Fixes: 644e9cb ("Add driver auto probing for x86 features v4")
	Suggested-by: Thomas Gleixner <[email protected]>
	Suggested-by: Borislav Petkov <[email protected]>
	Signed-off-by: Tony Luck <[email protected]>
	Signed-off-by: Borislav Petkov (AMD) <[email protected]>
	Cc: <[email protected]> # see above
Link: https://lore.kernel.org/r/20240517144312.GBZkdtAOuJZCvxhFbJ@fat_crate.local
(cherry picked from commit 9302248)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4018
Rebuild_History Non-Buildable kernel-5.14.0-570.37.1.el9_6
commit-author Tony Luck <[email protected]>
commit bc39bfb

New CPU #defines encode vendor and family as well as model.

	Signed-off-by: Tony Luck <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
(cherry picked from commit bc39bfb)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4018
Rebuild_History Non-Buildable kernel-5.14.0-570.37.1.el9_6
commit-author Tony Luck <[email protected]>
commit c2c887e

New CPU #defines encode vendor and family as well as model.

	Signed-off-by: Tony Luck <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
(cherry picked from commit c2c887e)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4018
Rebuild_History Non-Buildable kernel-5.14.0-570.37.1.el9_6
commit-author Josh Poimboeuf <[email protected]>
commit 42c141f

In cloud environments it can be useful to *only* enable the vmexit
mitigation and leave syscalls vulnerable.  Add that as an option.

This is similar to the old spectre_bhi=auto option which was removed
with the following commit:

  36d4fe1 ("x86/bugs: Remove CONFIG_BHI_MITIGATION_AUTO and spectre_bhi=auto")

with the main difference being that this has a more descriptive name and
is disabled by default.

Mitigation switch requested by Maksim Davydov <[email protected]>.

  [ bp: Massage. ]

	Signed-off-by: Josh Poimboeuf <[email protected]>
	Signed-off-by: Borislav Petkov (AMD) <[email protected]>
	Reviewed-by: Daniel Sneddon <[email protected]>
	Reviewed-by: Nikolay Borisov <[email protected]>
Link: https://lore.kernel.org/r/2cbad706a6d5e1da2829e5e123d8d5c80330148c.1719381528.git.jpoimboe@kernel.org
(cherry picked from commit 42c141f)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4018
Rebuild_History Non-Buildable kernel-5.14.0-570.37.1.el9_6
commit-author Tony Luck <[email protected]>
commit 691fef8

New CPU #defines encode vendor and family as well as model.

	Signed-off-by: Tony Luck <[email protected]>
	Acked-by: Rafael J. Wysocki <[email protected]>
	Signed-off-by: Rafael J. Wysocki <[email protected]>
(cherry picked from commit 691fef8)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4018
Rebuild_History Non-Buildable kernel-5.14.0-570.37.1.el9_6
commit-author Tony Luck <[email protected]>
commit 5b3eaf1
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.37.1.el9_6/5b3eaf10.failed

New CPU #defines encode vendor and family as well as model.

	Signed-off-by: Tony Luck <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Reviewed-by: Ilpo Järvinen <[email protected]>
	Signed-off-by: Ilpo Järvinen <[email protected]>
(cherry picked from commit 5b3eaf1)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	arch/x86/platform/atom/punit_atom_debug.c
jira LE-4018
Rebuild_History Non-Buildable kernel-5.14.0-570.37.1.el9_6
commit-author Tony Luck <[email protected]>
commit 490d573

New CPU #defines encode vendor and family as well as model.

	Signed-off-by: Tony Luck <[email protected]>
	Reviewed-by: Jithu Joseph <[email protected]>
	Reviewed-by: Kuppuswamy Sathyanarayanan <[email protected]>
	Acked-by: Hans de Goede <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Reviewed-by: Ilpo Järvinen <[email protected]>
	Signed-off-by: Ilpo Järvinen <[email protected]>
(cherry picked from commit 490d573)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4018
Rebuild_History Non-Buildable kernel-5.14.0-570.37.1.el9_6
commit-author Nícolas F. R. A. Prado <[email protected]>
commit 0e7b7bd
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.37.1.el9_6/0e7b7bde.failed

Move the ksft python module, which provides generic helpers for
kselftests, to a common directory so it can be more easily shared
between different tests.

	Signed-off-by: Nícolas F. R. A. Prado <[email protected]>
	Acked-by: Shuah Khan <[email protected]>
	Acked-by: Greg Kroah-Hartman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Greg Kroah-Hartman <[email protected]>
(cherry picked from commit 0e7b7bd)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	tools/testing/selftests/devices/probe/Makefile
#	tools/testing/selftests/devices/probe/test_discoverable_devices.py
#	tools/testing/selftests/kselftest/ksft.py
jira LE-4018
Rebuild_History Non-Buildable kernel-5.14.0-570.37.1.el9_6
commit-author Laura Nao <[email protected]>
commit 170c966
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.37.1.el9_6/170c966c.failed

The Python finished() helper currently exits with KSFT_FAIL when there
are only passed and skipped tests. Fix the logic to exit with KSFT_PASS
instead, making it consistent with its C and bash counterparts
(ksft_finished() and ktap_finished() respectively).
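
As a rough illustration of the exit-code rule described above, the following C sketch contrasts the two behaviours. It is an assumption-laden model rather than the selftest code itself: the function names are hypothetical, and the only values taken as given are the kselftest conventions KSFT_PASS = 0 and KSFT_FAIL = 1.

```c
#include <assert.h>

#define KSFT_PASS 0
#define KSFT_FAIL 1

/* Behaviour described as broken: skipped tests count against the run. */
int finished_exit_code_old(unsigned int planned, unsigned int passed,
			   unsigned int skipped)
{
	(void)skipped;	/* skips are ignored, so any skip causes a failure */
	return passed == planned ? KSFT_PASS : KSFT_FAIL;
}

/* Behaviour after the fix: a run with only passes and skips is a pass. */
int finished_exit_code_new(unsigned int planned, unsigned int passed,
			   unsigned int skipped)
{
	return passed + skipped == planned ? KSFT_PASS : KSFT_FAIL;
}
```

For a plan of three tests with two passes and one skip, the first function returns KSFT_FAIL and the second returns KSFT_PASS; a genuine failure still yields KSFT_FAIL in both.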

	Reviewed-by: Nícolas F. R. A. Prado <[email protected]>
Fixes: dacf1d7 ("kselftest: Add test to verify probe of devices from discoverable buses")
	Signed-off-by: Laura Nao <[email protected]>
	Reviewed-by: Muhammad Usama Anjum <[email protected]>
	Signed-off-by: Shuah Khan <[email protected]>
(cherry picked from commit 170c966)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	tools/testing/selftests/kselftest/ksft.py
jira LE-4018
Rebuild_History Non-Buildable kernel-5.14.0-570.37.1.el9_6
commit-author Namhyung Kim <[email protected]>
commit f6d9883
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.37.1.el9_6/f6d9883f.failed

To pick up changes from:

  149fd47 perf/x86/intel: Support Perfmon MSRs aliasing
  21b362c x86/resctrl: Enable shared RMID mode on Sub-NUMA Cluster (SNC) systems
  4f460bf cpufreq: acpi: move MSR_K7_HWCR_CPB_DIS_BIT into msr-index.h
  7ea8193 x86/cpufeatures: Add HWP highest perf change feature flag
  78ce84b x86/cpufeatures: Flip the /proc/cpuinfo appearance logic
  1beb348 x86/sev: Provide SVSM discovery support

This should be used to beautify x86 syscall arguments and it addresses
these tools/perf build warnings:

  Warning: Kernel ABI header differences:
  diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h
  diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h

Please see tools/include/uapi/README for details (it's in the first patch
of this series).

	Cc: Thomas Gleixner <[email protected]>
	Cc: Ingo Molnar <[email protected]>
	Cc: Borislav Petkov <[email protected]>
	Cc: Dave Hansen <[email protected]>
	Cc: "H. Peter Anvin" <[email protected]>
	Cc: [email protected]
	Signed-off-by: Namhyung Kim <[email protected]>
(cherry picked from commit f6d9883)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	tools/arch/x86/include/asm/cpufeatures.h
jira LE-4018
Rebuild_History Non-Buildable kernel-5.14.0-570.37.1.el9_6
commit-author Mario Limonciello <[email protected]>
commit 104edc6
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.37.1.el9_6/104edc6e.failed

This feature is an AMD unique feature of some processors, so put
AMD into the name.

	Signed-off-by: Mario Limonciello <[email protected]>
	Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
(cherry picked from commit 104edc6)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	arch/x86/include/asm/cpufeatures.h
#	tools/arch/x86/include/asm/cpufeatures.h
jira LE-4018
Rebuild_History Non-Buildable kernel-5.14.0-570.37.1.el9_6
commit-author Perry Yuan <[email protected]>
commit 1ad4667
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.37.1.el9_6/1ad46670.failed

CPUID leaf 0x80000026 advertises core types with different efficiency
rankings.

Bit 30 indicates the heterogeneous core topology feature. If the bit is
set, not all instances at the current hierarchical level have the same
core topology.

This is described in the AMD64 Architecture Programmers Manual Volume
2 and 3, doc ID #25493 and #25494.

	Signed-off-by: Perry Yuan <[email protected]>
Co-developed-by: Mario Limonciello <[email protected]>
	Signed-off-by: Mario Limonciello <[email protected]>
	Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
(cherry picked from commit 1ad4667)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	arch/x86/include/asm/cpufeatures.h
jira LE-4018
Rebuild_History Non-Buildable kernel-5.14.0-570.37.1.el9_6
commit-author Pawan Gupta <[email protected]>
commit 45239ba

Sometimes it is required to take actions based on whether a CPU is a performance
or efficiency core. As an example, the intel_pstate driver uses the Intel
core-type to determine CPU scaling. Also, some CPU vulnerabilities only affect
a specific CPU type; RFDS, for example, only affects Intel Atom. On hybrid
systems, which come in P+E, P-only (Core) and E-only (Atom) variants, it is not
straightforward to identify which variant is affected by a type-specific
vulnerability.

Such processors do have a CPUID field that can uniquely identify them: P+E,
P-only and E-only all enumerate CPUID.1A.CORE_TYPE, while P+E additionally
enumerates CPUID.7.HYBRID. Based on this information, it is possible for the
boot CPU to identify if a system has mixed CPU types.

Add a new field hw_cpu_type to struct cpuinfo_topology that stores the
hardware specific CPU type. This saves the overhead of IPIs to get the CPU
type of a different CPU. CPU type is populated early in the boot process,
before vulnerabilities are enumerated.

	Signed-off-by: Pawan Gupta <[email protected]>
Co-developed-by: Mario Limonciello <[email protected]>
	Signed-off-by: Mario Limonciello <[email protected]>
	Signed-off-by: Borislav Petkov (AMD) <[email protected]>
	Acked-by: Dave Hansen <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
(cherry picked from commit 45239ba)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4018
Rebuild_History Non-Buildable kernel-5.14.0-570.37.1.el9_6
commit-author Dave Hansen <[email protected]>
commit 85b0818
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.37.1.el9_6/85b08180.failed

The x86_match_cpu() infrastructure can match CPU steppings. Since
there are only 16 possible steppings, the matching infrastructure goes
all out and stores the stepping match as a bitmap. That means it can
match any possible steppings in a single list entry. Fun.

But it exposes this bitmap to each of the X86_MATCH_*() helpers when
none of them really need a bitmap. It makes up for this by exporting a
helper (X86_STEPPINGS()) which converts a contiguous stepping range
into the bitmap which every single user leverages.

Instead of a bitmap, have the main helper for this sort of thing
(X86_MATCH_VFM_STEPS()) just take a stepping range. This ends up
actually being even more compact than before.

Leave the helper in place (renamed to __X86_STEPPINGS()) to make it
more clear what is going on instead of just having a random GENMASK()
in the middle of an already complicated macro.

One oddity that I hit was this macro:

       X86_MATCH_VFM_STEPS(vfm, X86_STEPPING_MIN, max_stepping, issues)

It *could* have been converted over to take a min/max stepping value
for each entry. But that would have been a bit too verbose and would
prevent the one oddball in the list (INTEL_COMETLAKE_L stepping 0)
from sticking out.

Instead, just have it take a *maximum* stepping and imply that the match
is from 0=>max_stepping. This is functional for all the cases now and
also retains the nice property of having INTEL_COMETLAKE_L stepping 0
stick out like a sore thumb.

skx_cpuids[] is goofy. It uses the stepping match but encodes all
possible steppings. Just use a normal, non-stepping match helper.
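
The relationship between a stepping range and the stepping bitmap can be sketched as follows. This is a stand-alone illustration, not the kernel macros: `stepping_mask()` and `stepping_matches()` are hypothetical helpers, and treating an empty bitmap as "any stepping" is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Build a 16-bit bitmap with bits min..max set (inclusive), the same
 * idea as a GENMASK(max, min) over the 16 possible steppings.
 * Callers must pass 0 <= min <= max <= 15.
 */
uint16_t stepping_mask(unsigned int min, unsigned int max)
{
	uint32_t upto_max = (1u << (max + 1)) - 1;	/* bits 0..max */
	uint32_t below_min = (1u << min) - 1;		/* bits 0..min-1 */

	return (uint16_t)(upto_max & ~below_min);
}

/* An empty bitmap is treated as a wildcard (assumption for this sketch). */
int stepping_matches(uint16_t mask, unsigned int stepping)
{
	if (mask == 0)
		return 1;
	return (mask >> stepping) & 1;
}
```

A table entry that only takes a maximum stepping corresponds to `stepping_mask(0, max)`, so a single entry still covers every stepping from 0 up to that maximum.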

	Suggested-by: Ingo Molnar <[email protected]>
	Signed-off-by: Dave Hansen <[email protected]>
Link: https://lore.kernel.org/all/20241213185129.65527B2A%40davehans-spike.ostc.intel.com
(cherry picked from commit 85b0818)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	arch/x86/kernel/cpu/common.c
jira LE-4018
Rebuild_History Non-Buildable kernel-5.14.0-570.37.1.el9_6
commit-author Raag Jadav <[email protected]>
commit 3560a02

Fix typo in x86_match_cpu()'s description.

  [ bp: Massage commit message. ]

	Signed-off-by: Raag Jadav <[email protected]>
	Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
(cherry picked from commit 3560a02)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4018
Rebuild_History Non-Buildable kernel-5.14.0-570.37.1.el9_6
commit-author Laura Nao <[email protected]>
commit 7486440
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.37.1.el9_6/74864403.failed

Update the functions that print the test totals at the end of a selftest
to include a warning message when skipped tests are detected. The
message advises users that skipped tests may indicate missing
configuration options and suggests enabling them to improve coverage.

Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Laura Nao <[email protected]>
	Signed-off-by: Shuah Khan <[email protected]>
(cherry picked from commit 7486440)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	tools/testing/selftests/kselftest/ksft.py
jira LE-4018
Rebuild_History Non-Buildable kernel-5.14.0-570.37.1.el9_6
commit-author Pawan Gupta <[email protected]>
commit 7b9b54e

The comments need to reflect an implementation change.

No functional change.

	Signed-off-by: Pawan Gupta <[email protected]>
	Signed-off-by: Borislav Petkov (AMD) <[email protected]>
	Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
(cherry picked from commit 7b9b54e)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4018
Rebuild_History Non-Buildable kernel-5.14.0-570.37.1.el9_6
commit-author Pawan Gupta <[email protected]>
commit c339040
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.37.1.el9_6/c3390406.failed

To add cpu-type to the existing CPU matching infrastructure, the base macro
X86_MATCH_VENDOR_FAM_MODEL_STEPPINGS_FEATURE would need _CPU_TYPE appended to
its name. This makes an already long name longer, and somewhat incomprehensible.

To avoid this, rename the base macro to X86_MATCH_CPU. The macro name
doesn't need to explicitly tell everything that it matches. The arguments
to the macro already hint at that.

For consistency, use this base macro to define X86_MATCH_VFM and friends.

Remove unused X86_MATCH_VENDOR_FAM_MODEL_FEATURE while at it.

  [ bp: Massage commit message. ]

	Signed-off-by: Pawan Gupta <[email protected]>
	Signed-off-by: Borislav Petkov (AMD) <[email protected]>
	Signed-off-by: Ingo Molnar <[email protected]>
	Acked-by: Dave Hansen <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
(cherry picked from commit c339040)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	arch/x86/include/asm/cpu_device_id.h
jira LE-4018
Rebuild_History Non-Buildable kernel-5.14.0-570.37.1.el9_6
commit-author Pawan Gupta <[email protected]>
commit 00d7fc0
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.37.1.el9_6/00d7fc04.failed

In addition to matching vendor/family/model/feature, for hybrid variants it is
required to also match cpu-type. For example, some CPU vulnerabilities like
RFDS only affect a specific cpu-type.

To be able to also match CPUs based on their type, add a new field "type" to
struct x86_cpu_id which is used by the CPU-matching tables. Introduce
X86_CPU_TYPE_ANY for the cases that don't care about the cpu-type.

  [ bp: Massage commit message. ]

	Signed-off-by: Pawan Gupta <[email protected]>
	Signed-off-by: Borislav Petkov (AMD) <[email protected]>
	Signed-off-by: Ingo Molnar <[email protected]>
	Acked-by: Dave Hansen <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
(cherry picked from commit 00d7fc0)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	arch/x86/include/asm/cpu_device_id.h
jira LE-4018
cve CVE-2025-22077
Rebuild_History Non-Buildable kernel-5.14.0-570.39.1.el9_6
commit-author Kuniyuki Iwashima <[email protected]>
commit 95d2b9f
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.39.1.el9_6/95d2b9f6.failed

This reverts commit e9f2517.

Commit e9f2517 ("smb: client: fix TCP timers deadlock after
rmmod") is intended to fix a null-ptr-deref in LOCKDEP, which is
mentioned as CVE-2024-54680, but it actually did not fix anything;
the issue can be reproduced on top of it. [0]

Also, it reverted the change by commit ef7134c ("smb: client:
Fix use-after-free of network namespace.") and introduced a real
issue by reviving the kernel TCP socket.

When a reconnect happens for a CIFS connection, the socket state
transitions to FIN_WAIT_1.  Then, inet_csk_clear_xmit_timers_sync()
in tcp_close() stops all timers for the socket.

If an incoming FIN packet is lost, the socket will stay at FIN_WAIT_1
forever, and such sockets could be leaked up to net.ipv4.tcp_max_orphans.

Usually, FIN can be retransmitted by the peer, but if the peer aborts
the connection, the issue comes into reality.

I warned about this privately by pointing out the exact report [1],
but the bogus fix was finally merged.

So, we should not stop the timers to finally kill the connection on
our side in that case, meaning we must not use a kernel socket for
TCP whose sk->sk_net_refcnt is 0.

The kernel socket does not have a reference to its netns to make it
possible to tear down netns without cleaning up every resource in it.

For example, tunnel devices use a UDP socket internally, but we can
destroy netns without removing such devices and let it complete
during exit.  Otherwise, netns would be leaked when the last application
died.

However, this is problematic for TCP sockets because TCP has timers to
close the connection gracefully even after the socket is close()d.  The
lifetime of the socket and its netns is different from the lifetime of
the underlying connection.

If the socket user does not maintain the netns lifetime, the timer could
be fired after the socket is close()d and its netns is freed up, resulting
in use-after-free.

Actually, we have seen so many similar issues and converted such sockets
to have a reference to netns.

That's why I converted the CIFS client socket to have a reference to
netns (sk->sk_net_refcnt == 1), which is somehow mentioned as out-of-scope
of CIFS and technically wrong in e9f2517, but **is in-scope and right
fix**.

Regarding the LOCKDEP issue, we can prevent the module unload by
bumping the module refcount when switching the LOCKDEP key in
sock_lock_init_class_and_name(). [2]

For a while, let's revert the bogus fix.

Note that now we can use sk_net_refcnt_upgrade() for the socket
conversion, but I'll do so later separately to make backport easy.

Link: https://lore.kernel.org/all/[email protected]/ #[0]
Link: https://lore.kernel.org/netdev/[email protected]/ #[1]
Link: https://lore.kernel.org/lkml/[email protected]/ #[2]
Fixes: e9f2517 ("smb: client: fix TCP timers deadlock after rmmod")
	Signed-off-by: Kuniyuki Iwashima <[email protected]>
	Cc: [email protected]
	Signed-off-by: Steve French <[email protected]>
(cherry picked from commit 95d2b9f)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	fs/smb/client/connect.c
jira LE-4018
cve CVE-2025-38464
Rebuild_History Non-Buildable kernel-5.14.0-570.39.1.el9_6
commit-author Kuniyuki Iwashima <[email protected]>
commit 667eeab

syzbot reported a null-ptr-deref in tipc_conn_close() during netns
dismantle. [0]

tipc_topsrv_stop() iterates tipc_net(net)->topsrv->conn_idr and calls
tipc_conn_close() for each tipc_conn.

The problem is that tipc_conn_close() is called after releasing the
IDR lock.

At the same time, there might be tipc_conn_recv_work() running and it
could call tipc_conn_close() for the same tipc_conn and release its
last ->kref.

Once we release the IDR lock in tipc_topsrv_stop(), there is no
guarantee that the tipc_conn is alive.

Let's hold the ref before releasing the lock and put the ref after
tipc_conn_close() in tipc_topsrv_stop().
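
The "take a reference before dropping the lock" pattern can be modelled with a toy refcount. This is a single-threaded sketch under stated assumptions, not the TIPC code: `struct conn`, `conn_get()`, `conn_put()` and the demo functions are hypothetical, the lock is represented only by comments, and the concurrent receive work is simulated by a direct call.

```c
#include <assert.h>
#include <stdbool.h>

struct conn {
	int refs;
	bool freed;	/* set instead of calling free() so the demo can inspect it */
};

void conn_get(struct conn *c) { c->refs++; }

void conn_put(struct conn *c)
{
	if (--c->refs == 0)
		c->freed = true;
}

/* Stand-in for the receive work dropping the last reference concurrently. */
void concurrent_worker_put(struct conn *c) { conn_put(c); }

/* Returns true if the close path would touch an already freed object. */
bool close_touches_freed(const struct conn *c) { return c->freed; }

/* Lookup under the lock, but no reference held once the lock is dropped. */
bool demo_without_ref(void)
{
	struct conn c = { .refs = 1, .freed = false };

	/* lock; look up conn; unlock */
	concurrent_worker_put(&c);	/* last reference goes away here */
	return close_touches_freed(&c);
}

/* Reference taken while the lock is still held, released after close. */
bool demo_with_ref(void)
{
	struct conn c = { .refs = 1, .freed = false };
	bool bad;

	/* lock; look up conn */
	conn_get(&c);
	/* unlock */
	concurrent_worker_put(&c);	/* object survives: one reference left */
	bad = close_touches_freed(&c);
	conn_put(&c);			/* now the object may be released */
	return bad;
}
```

In the first variant the close path runs on an object whose last reference is already gone; in the second the extra reference keeps it alive until the close has finished.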

[0]:
BUG: KASAN: use-after-free in tipc_conn_close+0x122/0x140 net/tipc/topsrv.c:165
Read of size 8 at addr ffff888099305a08 by task kworker/u4:3/435

CPU: 0 PID: 435 Comm: kworker/u4:3 Not tainted 4.19.204-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: netns cleanup_net
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1fc/0x2ef lib/dump_stack.c:118
 print_address_description.cold+0x54/0x219 mm/kasan/report.c:256
 kasan_report_error.cold+0x8a/0x1b9 mm/kasan/report.c:354
 kasan_report mm/kasan/report.c:412 [inline]
 __asan_report_load8_noabort+0x88/0x90 mm/kasan/report.c:433
 tipc_conn_close+0x122/0x140 net/tipc/topsrv.c:165
 tipc_topsrv_stop net/tipc/topsrv.c:701 [inline]
 tipc_topsrv_exit_net+0x27b/0x5c0 net/tipc/topsrv.c:722
 ops_exit_list+0xa5/0x150 net/core/net_namespace.c:153
 cleanup_net+0x3b4/0x8b0 net/core/net_namespace.c:553
 process_one_work+0x864/0x1570 kernel/workqueue.c:2153
 worker_thread+0x64c/0x1130 kernel/workqueue.c:2296
 kthread+0x33f/0x460 kernel/kthread.c:259
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:415

Allocated by task 23:
 kmem_cache_alloc_trace+0x12f/0x380 mm/slab.c:3625
 kmalloc include/linux/slab.h:515 [inline]
 kzalloc include/linux/slab.h:709 [inline]
 tipc_conn_alloc+0x43/0x4f0 net/tipc/topsrv.c:192
 tipc_topsrv_accept+0x1b5/0x280 net/tipc/topsrv.c:470
 process_one_work+0x864/0x1570 kernel/workqueue.c:2153
 worker_thread+0x64c/0x1130 kernel/workqueue.c:2296
 kthread+0x33f/0x460 kernel/kthread.c:259
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:415

Freed by task 23:
 __cache_free mm/slab.c:3503 [inline]
 kfree+0xcc/0x210 mm/slab.c:3822
 tipc_conn_kref_release net/tipc/topsrv.c:150 [inline]
 kref_put include/linux/kref.h:70 [inline]
 conn_put+0x2cd/0x3a0 net/tipc/topsrv.c:155
 process_one_work+0x864/0x1570 kernel/workqueue.c:2153
 worker_thread+0x64c/0x1130 kernel/workqueue.c:2296
 kthread+0x33f/0x460 kernel/kthread.c:259
 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:415

The buggy address belongs to the object at ffff888099305a00
 which belongs to the cache kmalloc-512 of size 512
The buggy address is located 8 bytes inside of
 512-byte region [ffff888099305a00, ffff888099305c00)
The buggy address belongs to the page:
page:ffffea000264c140 count:1 mapcount:0 mapping:ffff88813bff0940 index:0x0
flags: 0xfff00000000100(slab)
raw: 00fff00000000100 ffffea00028b6b88 ffffea0002cd2b08 ffff88813bff0940
raw: 0000000000000000 ffff888099305000 0000000100000006 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff888099305900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff888099305980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff888099305a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                      ^
 ffff888099305a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff888099305b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

Fixes: c5fa7b3 ("tipc: introduce new TIPC server infrastructure")
	Reported-by: [email protected]
Closes: https://syzkaller.appspot.com/bug?extid=27169a847a70550d17be
	Signed-off-by: Kuniyuki Iwashima <[email protected]>
	Reviewed-by: Tung Nguyen <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 667eeab)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4018
Rebuild_History Non-Buildable kernel-5.14.0-570.39.1.el9_6
commit-author Niklas Schnelle <[email protected]>
commit dc287e4

Since commit 25f39d3 ("s390/pci: Ignore RID for isolated VFs") PFs
which are not initially configured but in standby are considered
isolated. That is, they create only a single-function PCI domain. Due to
the PCI domains being created on discovery, this means that even if they
are configured later on, sibling PFs and their child VFs will not be
added to their PCI domain breaking SR-IOV expectations.

The reason the referenced commit ignored standby PFs for the creation of
multi-function PCI subhierarchies, was to work around a PCI domain
renumbering scenario on reboot. The renumbering would occur after
removing a previously in standby PF, whose domain number is used for its
configured sibling PFs and their child VFs, but which itself remained in
standby. When this is followed by a reboot, the sibling PF is used
instead to determine the PCI domain number of it and its child VFs.

In principle it is not possible to know which standby PFs will be
configured later and which may be removed. The PCI domain and root bus
are pre-requisites for hotplug slots so the decision of which functions
belong to which domain can not be postponed. With the renumbering
occurring only in rare circumstances and being generally benign, accept
it as an oddity and fix SR-IOV for initially standby PFs simply by
allowing them to create PCI domains.

	Cc: [email protected]
	Reviewed-by: Gerd Bayer <[email protected]>
Fixes: 25f39d3 ("s390/pci: Ignore RID for isolated VFs")
	Signed-off-by: Niklas Schnelle <[email protected]>
	Signed-off-by: Alexander Gordeev <[email protected]>
(cherry picked from commit dc287e4)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4018
Rebuild_History Non-Buildable kernel-5.14.0-570.39.1.el9_6
commit-author Niklas Schnelle <[email protected]>
commit 0579388

This creates a new zpci_iov_find_parent_pf() function which a future
commit can use to find if a VF has a configured parent PF. Use
zdev->rid instead of zdev->devfn such that the new function can be used
before it has been decided if the RID will be exposed and zdev->devfn is
set. Also handle the hypothetical case that the RID is not available but
there is an otherwise matching zbus.

Fixes: 25f39d3 ("s390/pci: Ignore RID for isolated VFs")
	Cc: [email protected]
	Reviewed-by: Halil Pasic <[email protected]>
	Signed-off-by: Niklas Schnelle <[email protected]>
	Signed-off-by: Vasily Gorbik <[email protected]>
(cherry picked from commit 0579388)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4018
Rebuild_History Non-Buildable kernel-5.14.0-570.39.1.el9_6
commit-author Niklas Schnelle <[email protected]>
commit 2844ddb

In contrast to the commit message of the fixed commit, VFs whose parent
PF is not configured are not always isolated, that is, put on their own
PCI domain. This is because for VFs to be added to an existing PCI
domain it is enough for that PCI domain to share the same topology ID or
PCHID. Such a matching PCI domain without a parent PF may exist when
a PF from the same PCI card created the domain with the VF being a child
of a different, non-accessible PF. While not causing technical issues,
it makes the rules for which VFs are isolated inconsistent.

Fix this by explicitly checking that the parent PF exists on the PCI
domain determined by the topology ID or PCHID before registering the VF.
This works because a parent PF which is under control of this Linux
instance must be enabled and configured at the point where its child VFs
appear because otherwise SR-IOV could not have been enabled on the
parent.

Fixes: 25f39d3 ("s390/pci: Ignore RID for isolated VFs")
	Cc: [email protected]
	Reviewed-by: Halil Pasic <[email protected]>
	Signed-off-by: Niklas Schnelle <[email protected]>
	Signed-off-by: Vasily Gorbik <[email protected]>
(cherry picked from commit 2844ddb)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4018
Rebuild_History Non-Buildable kernel-5.14.0-570.39.1.el9_6
commit-author Niklas Schnelle <[email protected]>
commit 8691abd

For non-VFs, zpci_bus_is_isolated_vf() should return false because they
aren't VFs. While zpci_iov_find_parent_pf() specifically checks if
a function is a VF, it then simply returns that there is no parent. The
simplistic check for a parent then leads to these functions being
confused with isolated VFs and isolating them on their own domain even
if sibling PFs should share the domain.

Fix this by explicitly checking if a function is not a VF. Note also
that at this point the case where RIDs are ignored is already handled
and in this case all PCI functions get isolated by being detected in
zpci_bus_is_multifunction_root().

	Cc: [email protected]
Fixes: 2844ddb ("s390/pci: Fix handling of isolated VFs")
	Signed-off-by: Niklas Schnelle <[email protected]>
	Reviewed-by: Halil Pasic <[email protected]>
	Signed-off-by: Vasily Gorbik <[email protected]>
(cherry picked from commit 8691abd)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4018
Rebuild_History Non-Buildable kernel-5.14.0-570.39.1.el9_6
commit-author Dennis Chen <[email protected]>
commit 50b2af4

Currently the tx_dropped field in VF stats is not updated correctly
when reading stats from the PF. This is because it reads from
i40e_eth_stats.tx_discards which seems to be unused for per VSI stats,
as it is not updated by i40e_update_eth_stats() and the corresponding
register, GLV_TDPC, is not implemented[1].

Use i40e_eth_stats.tx_errors instead, which is actually updated by
i40e_update_eth_stats() by reading from GLV_TEPC.

To test, create a VF and try to send bad packets through it:

$ echo 1 > /sys/class/net/enp2s0f0/device/sriov_numvfs
$ cat test.py
from scapy.all import *

vlan_pkt = Ether(dst="ff:ff:ff:ff:ff:ff") / Dot1Q(vlan=999) / IP(dst="192.168.0.1") / ICMP()
ttl_pkt = IP(dst="8.8.8.8", ttl=0) / ICMP()

print("Send packet with bad VLAN tag")
sendp(vlan_pkt, iface="enp2s0f0v0")
print("Send packet with TTL=0")
sendp(ttl_pkt, iface="enp2s0f0v0")
$ ip -s link show dev enp2s0f0
16: enp2s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 3c:ec:ef:b7:e0:ac brd ff:ff:ff:ff:ff:ff
    RX:  bytes packets errors dropped  missed   mcast
             0       0      0       0       0       0
    TX:  bytes packets errors dropped carrier collsns
             0       0      0       0       0       0
    vf 0     link/ether e2:c6:fd:c1:1e:92 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    RX: bytes  packets  mcast   bcast   dropped
             0        0       0       0        0
    TX: bytes  packets   dropped
             0        0        0
$ python test.py
Send packet with bad VLAN tag
.
Sent 1 packets.
Send packet with TTL=0
.
Sent 1 packets.
$ ip -s link show dev enp2s0f0
16: enp2s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 3c:ec:ef:b7:e0:ac brd ff:ff:ff:ff:ff:ff
    RX:  bytes packets errors dropped  missed   mcast
             0       0      0       0       0       0
    TX:  bytes packets errors dropped carrier collsns
             0       0      0       0       0       0
    vf 0     link/ether e2:c6:fd:c1:1e:92 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    RX: bytes  packets  mcast   bcast   dropped
             0        0       0       0        0
    TX: bytes  packets   dropped
             0        0        0

A packet with non-existent VLAN tag and a packet with TTL = 0 are sent,
but tx_dropped is not incremented.

After patch:

$ ip -s link show dev enp2s0f0
19: enp2s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 3c:ec:ef:b7:e0:ac brd ff:ff:ff:ff:ff:ff
    RX:  bytes packets errors dropped  missed   mcast
             0       0      0       0       0       0
    TX:  bytes packets errors dropped carrier collsns
             0       0      0       0       0       0
    vf 0     link/ether 4a:b7:3d:37:f7:56 brd ff:ff:ff:ff:ff:ff, spoof checking on, link-state auto, trust off
    RX: bytes  packets  mcast   bcast   dropped
             0        0       0       0        0
    TX: bytes  packets   dropped
             0        0        2

Fixes: dc645da ("i40e: implement VF stats NDO")
	Signed-off-by: Dennis Chen <[email protected]>
Link: https://www.intel.com/content/www/us/en/content-details/596333/intel-ethernet-controller-x710-tm4-at2-carlsville-datasheet.html
	Reviewed-by: Simon Horman <[email protected]>
	Tested-by: Rafal Romanowski <[email protected]>
	Signed-off-by: Tony Nguyen <[email protected]>
(cherry picked from commit 50b2af4)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4018
Rebuild_History Non-Buildable kernel-5.14.0-570.39.1.el9_6
commit-author Victor Nogueira <[email protected]>
commit ffdde7b

Lion's patch [1] revealed an ancient bug in the qdisc API.
Whenever a user creates/modifies a qdisc specifying as a parent another
qdisc, the qdisc API will, during grafting, detect that the user is
not trying to attach to a class and reject. However grafting is
performed after qdisc_create (and thus the qdiscs' init callback) is
executed. In qdiscs that eventually call qdisc_tree_reduce_backlog
during init or change (such as fq, hhf, choke, etc), an issue
arises. For example, executing the following commands:

sudo tc qdisc add dev lo root handle a: htb default 2
sudo tc qdisc add dev lo parent a: handle beef fq

Qdiscs such as fq, hhf, choke, etc unconditionally invoke
qdisc_tree_reduce_backlog() in their control path init() or change() which
then causes a failure to find the child class; however, that does not stop
the unconditional invocation of the assumed child qdisc's qlen_notify with
a null class. All these qdiscs make the assumption that class is non-null.

The solution is to ensure that qdisc_leaf(), which looks up the parent
class and is invoked prior to qdisc_create(), returns failure on
not finding the class.
In this patch, we leverage qdisc_leaf to return ERR_PTRs whenever the
parentid doesn't correspond to a class, so that we can detect it
earlier on and abort before qdisc_create is called.
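
The ordering described above can be illustrated with a small Python model. This is only a sketch of the idea, not kernel code: the names Qdisc, leaf, create_child, add_child and NoSuchClass are hypothetical stand-ins, and the exception plays the role that an ERR_PTR return plays in C.

```python
class NoSuchClass(Exception):
    """Hypothetical stand-in for the ERR_PTR returned when no class matches."""


class Qdisc:
    def __init__(self, handle, classes=()):
        self.handle = handle
        self.classes = set(classes)  # minor IDs of the classes this qdisc owns


init_calls = []  # records which child qdiscs had their init step run


def leaf(parent, classid):
    """Model of the parent-class lookup: fail loudly instead of returning None."""
    if classid not in parent.classes:
        raise NoSuchClass(f"{parent.handle}:{classid} is not a class")
    return classid


def create_child(handle):
    """Model of qdisc creation, whose init callback may touch the parent."""
    init_calls.append(handle)
    return Qdisc(handle)


def add_child(parent, classid, handle):
    # Look up the parent class first, so a bad parent aborts before init runs.
    leaf(parent, classid)
    return create_child(handle)
```

In this model, attaching to an existing class (for example minor ID 2) runs the init step, while attaching to the qdisc itself (minor ID 0, as in "parent a:") raises before any init code is reached.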

[1] https://lore.kernel.org/netdev/[email protected]/

Fixes: 5e50da0 ("[NET_SCHED]: Fix endless loops (part 2): "simple" qdiscs")
	Reported-by: [email protected]
Closes: https://lore.kernel.org/netdev/[email protected]/
	Reported-by: [email protected]
Closes: https://lore.kernel.org/netdev/[email protected]/
	Reported-by: [email protected]
Closes: https://lore.kernel.org/netdev/[email protected]/
	Reported-by: [email protected]
Closes: https://lore.kernel.org/netdev/[email protected]/
	Reported-by: [email protected]
Closes: https://lore.kernel.org/netdev/[email protected]/
	Acked-by: Jamal Hadi Salim <[email protected]>
	Reviewed-by: Cong Wang <[email protected]>
	Signed-off-by: Victor Nogueira <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit ffdde7b)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4018
Rebuild_History Non-Buildable kernel-5.14.0-570.39.1.el9_6
commit-author Gerd Bayer <[email protected]>
commit 0d48566
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.39.1.el9_6/0d48566d.failed

Since this guards only the Function Measurement Block, rename from
generic lock to fmb_lock in preparation to introduce another lock
that guards the state member

	Signed-off-by: Gerd Bayer <[email protected]>
	Reviewed-by: Niklas Schnelle <[email protected]>
	Signed-off-by: Heiko Carstens <[email protected]>
(cherry picked from commit 0d48566)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	arch/s390/pci/pci.c
jira LE-4018
Rebuild_History Non-Buildable kernel-5.14.0-570.39.1.el9_6
commit-author Gerd Bayer <[email protected]>
commit bcb5d6c
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.39.1.el9_6/bcb5d6c7.failed

There's a number of tasks that need the state of a zpci device
to be stable. Other tasks need to be synchronized as they change the state.

State changes could be generated by the system as availability or error
events, or be requested by the user through manipulations in sysfs.
Some other actions accessible through sysfs - like device resets - need the
state to be stable.

Unsynchronized state handling could lead to unusable devices. This has
been observed in cases of concurrent state changes through systemd udev
rules and DPM boot control. Some breakage can be provoked by artificial
tests, e.g. through repetitively injecting "recover" on a PCI function
through sysfs while running a "hotplug remove/add" in a loop through a
PCI slot's "power" attribute in sysfs. After a few iterations this could
result in a kernel oops.

So introduce a new mutex "state_lock" to guard the state property of the
struct zpci_dev. Acquire this lock in all tasks that modify the state:

- hotplug add and remove, through the PCI hotplug slot entry,
- availability events, as reported by the platform,
- error events, as reported by the platform,
- during device resets, explicit through sysfs requests or
  implicit through the common PCI layer.

Break an inner _do_recover() routine out of recover_store() to
separate the necessary synchronizations from the actual manipulations of
the zpci_dev required for the reset.

With the following changes I was able to run the inject loops for hours
without hitting an error.

	Signed-off-by: Gerd Bayer <[email protected]>
	Reviewed-by: Niklas Schnelle <[email protected]>
	Signed-off-by: Heiko Carstens <[email protected]>
(cherry picked from commit bcb5d6c)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	arch/s390/pci/pci.c
jira LE-4018
Rebuild_History Non-Buildable kernel-5.14.0-570.39.1.el9_6
commit-author Gerd Bayer <[email protected]>
commit 6ee600b

Centralize the removal so all paths are covered and the hotplug slot
will remain active until the device is really destroyed.

	Signed-off-by: Gerd Bayer <[email protected]>
	Reviewed-by: Niklas Schnelle <[email protected]>
	Signed-off-by: Heiko Carstens <[email protected]>
(cherry picked from commit 6ee600b)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4018
cve CVE-2024-56699
Rebuild_History Non-Buildable kernel-5.14.0-570.39.1.el9_6
commit-author Niklas Schnelle <[email protected]>
commit c4a585e

In commit 6ee600b ("s390/pci: remove hotplug slot when releasing the
device") the zpci_exit_slot() was moved from zpci_device_reserved() to
zpci_release_device() with the intention of keeping the hotplug slot
around until the device is actually removed.

Now zpci_release_device() is only called once all references are
dropped. Since the zPCI subsystem only drops its reference once the
device is in the reserved state it follows that zpci_release_device()
must only deal with devices in the reserved state. Despite that it
contains code to tear down from both configured and standby state. For
the standby case this already includes the removal of the hotplug slot
so would cause a double removal if a device was ever removed in
either configured or standby state.

Instead of causing a potential double removal in a case that should
never happen explicitly WARN_ON() if a device in non-reserved state is
released and get rid of the dead code cases.

Fixes: 6ee600b ("s390/pci: remove hotplug slot when releasing the device")
	Reviewed-by: Matthew Rosato <[email protected]>
	Reviewed-by: Gerd Bayer <[email protected]>
	Tested-by: Gerd Bayer <[email protected]>
	Signed-off-by: Niklas Schnelle <[email protected]>
	Signed-off-by: Heiko Carstens <[email protected]>
(cherry picked from commit c4a585e)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4018
cve CVE-2025-37974
Rebuild_History Non-Buildable kernel-5.14.0-570.39.1.el9_6
commit-author Niklas Schnelle <[email protected]>
commit 42420c5

The zpci_create_device() function returns an error pointer that needs to
be checked before dereferencing it as a struct zpci_dev pointer. Add the
missing check in __clp_add() where it was missed when adding the
scan_list in the fixed commit. Simply not adding the device to the scan
list results in the previous behavior.

	Cc: [email protected]
Fixes: 0467cdd ("s390/pci: Sort PCI functions prior to creating virtual busses")
	Signed-off-by: Niklas Schnelle <[email protected]>
	Reviewed-by: Gerd Bayer <[email protected]>
	Signed-off-by: Heiko Carstens <[email protected]>
(cherry picked from commit 42420c5)
	Signed-off-by: Jonathan Maple <[email protected]>
…hild VFs

jira LE-4018
cve CVE-2025-37946
Rebuild_History Non-Buildable kernel-5.14.0-570.39.1.el9_6
commit-author Niklas Schnelle <[email protected]>
commit 05a2538
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.39.1.el9_6/05a2538f.failed

With commit bcb5d6c ("s390/pci: introduce lock to synchronize state
of zpci_dev's") the code to ignore power off of a PF that has child VFs
was changed from a direct return to a goto to the unlock and
pci_dev_put() section. The change however left the existing pci_dev_put()
untouched, resulting in a double put. This can subsequently cause a use
after free if the struct pci_dev is released in an unexpected state.
Fix this by removing the extra pci_dev_put().

	Cc: [email protected]
Fixes: bcb5d6c ("s390/pci: introduce lock to synchronize state of zpci_dev's")
	Signed-off-by: Niklas Schnelle <[email protected]>
	Reviewed-by: Gerd Bayer <[email protected]>
	Signed-off-by: Heiko Carstens <[email protected]>
(cherry picked from commit 05a2538)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	drivers/pci/hotplug/s390_pci_hpc.c
…device()

jira LE-4018
Rebuild_History Non-Buildable kernel-5.14.0-570.39.1.el9_6
commit-author Niklas Schnelle <[email protected]>
commit d76f963

Remove zpci_bus_remove_device() and zpci_disable_device() calls from
zpci_release_device(). These calls were done when the device
transitioned into the ZPCI_FN_STATE_STANDBY state which is guaranteed to
happen before it enters the ZPCI_FN_STATE_RESERVED state. When
zpci_release_device() is called the device is known to be in the
ZPCI_FN_STATE_RESERVED state which is also checked by a WARN_ON().

	Cc: [email protected]
Fixes: a46044a ("s390/pci: fix zpci_zdev_put() on reserve")
	Reviewed-by: Gerd Bayer <[email protected]>
	Reviewed-by: Julian Ruess <[email protected]>
	Tested-by: Gerd Bayer <[email protected]>
	Signed-off-by: Niklas Schnelle <[email protected]>
	Signed-off-by: Heiko Carstens <[email protected]>
(cherry picked from commit d76f963)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4018
Rebuild_History Non-Buildable kernel-5.14.0-570.39.1.el9_6
commit-author Niklas Schnelle <[email protected]>
commit 47c3978
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-570.39.1.el9_6/47c39784.failed

disable_slot() takes a struct zpci_dev from the Configured to the
Standby state. In Standby there is still a hotplug slot so this is not
usually a case of sysfs self deletion. This is important because self
deletion gets very hairy in terms of locking (see for example
recover_store() in arch/s390/pci/pci_sysfs.c).

Because the pci_dev_put() is not within the critical section of the
zdev->state_lock however, disable_slot() can turn into a case of self
deletion if zPCI device event handling slips between the mutex_unlock()
and the pci_dev_put(). If the latter is the last put and
zpci_release_device() is called this then tries to remove the hotplug
slot via zpci_exit_slot() which will try to remove the hotplug slot
directory the disable_slot() is part of i.e. self deletion.

Prevent this by widening the zdev->state_lock critical section to
include the pci_dev_put() which is then guaranteed to happen with the
struct zpci_dev still in Standby state ensuring it will not lead to
a zpci_release_device() call as at least the zPCI event handling code
still holds a reference.

	Cc: [email protected]
Fixes: a46044a ("s390/pci: fix zpci_zdev_put() on reserve")
	Reviewed-by: Gerd Bayer <[email protected]>
	Tested-by: Gerd Bayer <[email protected]>
	Signed-off-by: Niklas Schnelle <[email protected]>
	Signed-off-by: Heiko Carstens <[email protected]>
(cherry picked from commit 47c3978)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	drivers/pci/hotplug/s390_pci_hpc.c
jira LE-4018
Rebuild_History Non-Buildable kernel-5.14.0-570.39.1.el9_6
commit-author Niklas Schnelle <[email protected]>
commit 4b1815a

The architecture assumes that PCI functions can be removed synchronously
as PCI events are processed. This however clashes with the reference
counting of struct pci_dev which allows device drivers to hold on to a
struct pci_dev reference even as the underlying device is removed. To
bridge this gap commit 2a671f7 ("s390/pci: fix use after free of
zpci_dev") keeps the struct zpci_dev in ZPCI_FN_STATE_RESERVED state
until common code releases the struct pci_dev. Only when all references
are dropped, the struct zpci_dev can be removed and freed.

Later commit a46044a ("s390/pci: fix zpci_zdev_put() on reserve")
moved the deletion of the struct zpci_dev from the zpci_list in
zpci_release_device() to the point where the device is reserved. This
was done to prevent handling events for a device that is already being
removed, e.g. when the platform generates both PCI event codes 0x304
and 0x308. In retrospect, deletion from the zpci_list in the release
function without holding the zpci_list_lock was also racy.

A side effect of this handling is that if the underlying device
re-appears while the struct zpci_dev is in the ZPCI_FN_STATE_RESERVED
state, the new and old instances of the struct zpci_dev and/or struct
pci_dev may clash. For example when trying to create the IOMMU sysfs
files for the new instance. In this case, re-adding the new instance is
aborted. The old instance is removed, and the device will remain absent
until the platform issues another event.

Fix this by allowing the struct zpci_dev to be brought back up right
until it is finally removed. To this end also keep the struct zpci_dev
in the zpci_list until it is finally released when all references have
been dropped.

Deletion from the zpci_list from within the release function is made
safe by using kref_put_lock() with the zpci_list_lock. This ensures that
the releasing code holds the last reference.
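
The lifetime rule above can be modelled in a few lines of Python. This is a simplified sketch, not the kernel implementation: Registry and Device are hypothetical names, and unlike kref_put_lock() the model takes the list lock on every put rather than only when the count may reach zero.

```python
import threading


class Registry:
    """Toy model of a global device list guarded by a list lock."""

    def __init__(self):
        self.lock = threading.Lock()
        self.devices = {}

    def add(self, fid):
        dev = Device(fid, self)
        with self.lock:
            self.devices[fid] = dev
        return dev

    def get(self, fid):
        """Look up a device and take a reference, all under the list lock."""
        with self.lock:
            dev = self.devices.get(fid)
            if dev is not None:
                dev.refs += 1
            return dev


class Device:
    def __init__(self, fid, registry):
        self.fid = fid
        self.registry = registry
        self.refs = 1  # the creator holds the initial reference
        self.released = False

    def put(self):
        """Drop a reference; the final drop unlinks the device under the list lock."""
        with self.registry.lock:
            self.refs -= 1
            if self.refs == 0:
                del self.registry.devices[self.fid]
                self.released = True
```

Because lookup and the final drop both run under the same lock, a lookup either finds the device and takes a reference before the count can reach zero, or finds nothing because the device was already unlinked.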

	Cc: [email protected]
Fixes: a46044a ("s390/pci: fix zpci_zdev_put() on reserve")
	Reviewed-by: Gerd Bayer <[email protected]>
	Tested-by: Gerd Bayer <[email protected]>
	Signed-off-by: Niklas Schnelle <[email protected]>
	Signed-off-by: Heiko Carstens <[email protected]>
(cherry picked from commit 4b1815a)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-4018
Rebuild_History Non-Buildable kernel-5.14.0-570.39.1.el9_6
commit-author Niklas Schnelle <[email protected]>
commit 774a1fa

Prior changes ensured that when zpci_release_device() is called and it
removed the zdev from the zpci_list this instance can not be found via
the zpci_list anymore even while allowing re-add of reserved devices.
This only accounts for the overall lifetime and zpci_list addition and
removal, it does not yet prevent concurrent add of a new instance for
the same underlying device. Such concurrent add would subsequently cause
issues such as attempted re-use of the same IOMMU sysfs directory and is
generally undesired.

Introduce a new zpci_add_remove_lock mutex to serialize adding a new
device with removal. Together this ensures that if a struct zpci_dev is
not found in the zpci_list it was either already removed and torn down,
or its removal and tear down is in progress with the
zpci_add_remove_lock held.

	Cc: [email protected]
Fixes: a46044a ("s390/pci: fix zpci_zdev_put() on reserve")
	Reviewed-by: Gerd Bayer <[email protected]>
	Tested-by: Gerd Bayer <[email protected]>
	Signed-off-by: Niklas Schnelle <[email protected]>
	Signed-off-by: Heiko Carstens <[email protected]>
(cherry picked from commit 774a1fa)
	Signed-off-by: Jonathan Maple <[email protected]>
…terface

jira LE-4018
cve CVE-2025-38500
Rebuild_History Non-Buildable kernel-5.14.0-570.39.1.el9_6
commit-author Eyal Birger <[email protected]>
commit a90b2a1

collect_md property on xfrm interfaces can only be set on device creation,
thus xfrmi_changelink() should fail when called on such interfaces.

The check to enforce this was done only in the case where the xi was
returned from xfrmi_locate() which doesn't look for the collect_md
interface, and thus the validation was never reached.

Calling changelink would thus erroneously place the special interface xi
in the xfrmi_net->xfrmi hash, but since it also exists in the
xfrmi_net->collect_md_xfrmi pointer it would lead to a double free when
the net namespace was taken down [1].

Change the check to use the xi from netdev_priv which is available earlier
in the function to prevent changes in xfrm collect_md interfaces.

[1] resulting oops:
[    8.516540] kernel BUG at net/core/dev.c:12029!
[    8.516552] Oops: invalid opcode: 0000 [#1] SMP NOPTI
[    8.516559] CPU: 0 UID: 0 PID: 12 Comm: kworker/u80:0 Not tainted 6.15.0-virtme #5 PREEMPT(voluntary)
[    8.516565] Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[    8.516569] Workqueue: netns cleanup_net
[    8.516579] RIP: 0010:unregister_netdevice_many_notify+0x101/0xab0
[    8.516590] Code: 90 0f 0b 90 48 8b b0 78 01 00 00 48 8b 90 80 01 00 00 48 89 56 08 48 89 32 4c 89 80 78 01 00 00 48 89 b8 80 01 00 00 eb ac 90 <0f> 0b 48 8b 45 00 4c 8d a0 88 fe ff ff 48 39 c5 74 5c 41 80 bc 24
[    8.516593] RSP: 0018:ffffa93b8006bd30 EFLAGS: 00010206
[    8.516598] RAX: ffff98fe4226e000 RBX: ffffa93b8006bd58 RCX: ffffa93b8006bc60
[    8.516601] RDX: 0000000000000004 RSI: 0000000000000000 RDI: dead000000000122
[    8.516603] RBP: ffffa93b8006bdd8 R08: dead000000000100 R09: ffff98fe4133c100
[    8.516605] R10: 0000000000000000 R11: 00000000000003d2 R12: ffffa93b8006be00
[    8.516608] R13: ffffffff96c1a510 R14: ffffffff96c1a510 R15: ffffa93b8006be00
[    8.516615] FS:  0000000000000000(0000) GS:ffff98fee73b7000(0000) knlGS:0000000000000000
[    8.516619] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    8.516622] CR2: 00007fcd2abd0700 CR3: 000000003aa40000 CR4: 0000000000752ef0
[    8.516625] PKRU: 55555554
[    8.516627] Call Trace:
[    8.516632]  <TASK>
[    8.516635]  ? rtnl_is_locked+0x15/0x20
[    8.516641]  ? unregister_netdevice_queue+0x29/0xf0
[    8.516650]  ops_undo_list+0x1f2/0x220
[    8.516659]  cleanup_net+0x1ad/0x2e0
[    8.516664]  process_one_work+0x160/0x380
[    8.516673]  worker_thread+0x2aa/0x3c0
[    8.516679]  ? __pfx_worker_thread+0x10/0x10
[    8.516686]  kthread+0xfb/0x200
[    8.516690]  ? __pfx_kthread+0x10/0x10
[    8.516693]  ? __pfx_kthread+0x10/0x10
[    8.516697]  ret_from_fork+0x82/0xf0
[    8.516705]  ? __pfx_kthread+0x10/0x10
[    8.516709]  ret_from_fork_asm+0x1a/0x30
[    8.516718]  </TASK>

Fixes: abc340b ("xfrm: interface: support collect metadata mode")
	Reported-by: Lonial Con <[email protected]>
	Signed-off-by: Eyal Birger <[email protected]>
	Signed-off-by: Steffen Klassert <[email protected]>
(cherry picked from commit a90b2a1)
	Signed-off-by: Jonathan Maple <[email protected]>
Rebuild_History BUILDABLE
Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
Number of commits in upstream range v5.14~1..kernel-mainline: 323388
Number of commits in rpm: 37
Number of commits matched with upstream: 34 (91.89%)
Number of commits in upstream but not in rpm: 323354
Number of commits NOT found in upstream: 3 (8.11%)

Rebuilding Kernel on Branch rocky9_6_rebuild_kernel-5.14.0-570.39.1.el9_6 for kernel-5.14.0-570.39.1.el9_6
Clean Cherry Picks: 23 (67.65%)
Empty Cherry Picks: 5 (14.71%)
_______________________________
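
The percentages in this summary can be reproduced with a short Python check. The helper name pct is hypothetical, and the choice of denominators is inferred from the arithmetic rather than stated by the tool: the match rates appear to be relative to the 37 commits in the rpm, and the cherry-pick rates relative to the 34 matched commits.

```python
def pct(part, whole):
    """Format part/whole as a percentage with two decimals, as in the summary."""
    return f"{100 * part / whole:.2f}%"


commits_in_rpm = 37
matched = 34
not_found = 3
clean_picks = 23
empty_picks = 5

summary = {
    "matched": pct(matched, commits_in_rpm),
    "not_found": pct(not_found, commits_in_rpm),
    "clean": pct(clean_picks, matched),   # assumed denominator: matched commits
    "empty": pct(empty_picks, matched),   # assumed denominator: matched commits
}
print(summary)
```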

Full Details Located here:
ciq/ciq_backports/kernel-5.14.0-570.39.1.el9_6/rebuild.details.txt

Includes:
* git commit header above
* Empty Commits with upstream SHA
* RPM ChangeLog Entries that could not be matched

Individual Empty Commit failures contained in the same containing directory.
The git message for empty commits will have the path for the failed commit.
File names are the first 8 characters of the upstream SHA
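
As a convenience, the location of a failed cherry-pick reference can be derived from the kernel tag and the upstream SHA. The sketch below assumes the directory layout shown in the commit messages; the function name failed_ref_path is hypothetical and not part of the rebuild tooling.

```python
from pathlib import PurePosixPath


def failed_ref_path(kernel_tag, upstream_sha):
    """Build the path of the .failed file from the first 8 characters of the SHA."""
    return PurePosixPath("ciq/ciq_backports") / kernel_tag / f"{upstream_sha[:8]}.failed"


path = failed_ref_path(
    "kernel-5.14.0-570.39.1.el9_6",
    "0d48566d4b58946c8e1b0baac0347616060a81c9",
)
print(path)
```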
Collaborator

@bmastbergen bmastbergen left a comment

🥌

@jdieter jdieter left a comment

LGTM

@PlaidCat PlaidCat merged commit 1b9ea68 into rocky9_6 Sep 3, 2025
4 checks passed
@PlaidCat PlaidCat deleted the rocky9_6_rebuild branch September 3, 2025 19:13
3 participants