[LTS 9.2] igb: set max size RX buffer when store bad packet is enabled #297

Merged
merged 1 commit into from
May 30, 2025

pvts-mat
Contributor

[LTS 9.2]
CVE-2023-45871
VULN-6698

Problem

From the company which discovered the bug:

https://www.omicronenergy.com/download/file/5ddf37266b0d79a7ba5818893202d9c1/

Linux Kernel vulnerability CVE-2023-45871 allows an attacker to cause memory corruption in the network driver of the *BX device by sending special crafted network traffic. The behaviour of the system caused by memory corruption is highly unpredictable: the device is either restarted, processes crash, or a manual reboot is required.

The CVSS 3.1 scoring is inconsistent between sources, ranging from 7.5 (NIST) to 9.8 (the advisory above).

Applicability

The igb module is enabled in ciqlts9_2

configs/kernel-x86_64-rhel.config:

CONFIG_IGB_DCA=y
CONFIG_IGB_HWMON=y
CONFIG_IGB=m
CONFIG_IGBVF=m

Solution

The mainline fix is given in bb5ed01. It was backported to multiple stable versions without any changes, and also to CBR 7.9 in d3573f5, LTS 8.6 in 6ef78b9 and LTS 9.4 in aee509a (by Red Hat). The commit applies to ciqlts9_2 without any modifications as well.

kABI check: passed

DEBUG=1 CVE=CVE-2023-45871 ./ninja.sh _kabi_checked__$(uname -m)--test--ciqlts9_2-CVE-2023-45871

[0/1] Check ABI of kernel [ciqlts9_2-CVE-2023-45871]
++ uname -m
+ python3 /data/src/ctrliq-github/kernel-dist-git-el-9.2/SOURCES/check-kabi -k /data/src/ctrliq-github/kernel-dist-git-el-9.2/SOURCES/Module.kabi_x86_64 -s vms/x86_64--build--ciqlts9_2/build_files/kernel-src-tree-ciqlts9_2-CVE-2023-45871/Module.symvers
kABI check passed
+ touch state/kernels/ciqlts9_2-CVE-2023-45871/x86_64/kabi_checked

Boot test: passed

boot-test.log

Kselftests: passed relative

Coverage

bpf (except test_progs, test_progs-no_alu32, test_xsk.sh, test_sockmap, test_kmod.sh), breakpoints, capabilities, cgroup (except test_memcontrol), clone3, core, cpu-hotplug, cpufreq, drivers/dma-buf, drivers/net/bonding, drivers/net/team, filesystems/binderfs, firmware, fpu, ftrace, futex, gpio, intel_pstate, ipc, ir, kcmp, kexec, kvm, landlock, lib, livepatch, membarrier, memfd, memory-hotplug, mincore, mount, mqueue, nci, net/forwarding (except dual_vxlan_bridge.sh, vxlan_bridge_1d_ipv6.sh, tc_police.sh, sch_ets.sh, sch_tbf_ets.sh, mirror_gre_vlan_bridge_1q.sh, mirror_gre_bridge_1d_vlan.sh, tc_actions.sh, sch_red.sh, ipip_hier_gre_keys.sh, q_in_vni.sh, sch_tbf_root.sh, sch_tbf_prio.sh), net/mptcp (except simult_flows.sh, userspace_pm.sh), net (except reuseport_addr_any.sh, gro.sh, ip_defrag.sh, udpgso_bench.sh, fib_nexthops.sh, udpgro_fwd.sh, reuseaddr_conflict, xfrm_policy.sh, txtimestamp.sh), netfilter (except nft_trans_stress.sh), nsfs, openat2, pid_namespace, pidfd, proc (except proc-pid-vm), pstore, ptrace, rlimits, rseq, seccomp, sgx, sigaltstack, size, splice, static_keys, syscall_user_dispatch, tc-testing, tdx, timens, timers (except raw_skew), tmpfs, tpm2, vDSO, vm, x86, zram

Reference

kselftests--ciqlts9_2--run1.log
kselftests--ciqlts9_2--run2.log
kselftests--ciqlts9_2--run3.log

Patch

kselftests--ciqlts9_2-CVE-2023-45871--run1.log
kselftests--ciqlts9_2-CVE-2023-45871--run2.log
kselftests--ciqlts9_2-CVE-2023-45871--run3.log

Comparison

Some differences in test results can be observed, but they are neither specific to the patch nor related to the modified module. Discussion below.

$ ./ktests.xsh diff -d kselftests*.log

Column    File
--------  ----------------------------------------------
Status0   kselftests--ciqlts9_2--run1.log
Status1   kselftests--ciqlts9_2--run2.log
Status2   kselftests--ciqlts9_2--run3.log
Status3   kselftests--ciqlts9_2-CVE-2023-45871--run1.log
Status4   kselftests--ciqlts9_2-CVE-2023-45871--run2.log
Status5   kselftests--ciqlts9_2-CVE-2023-45871--run3.log

TestCase              Status0  Status1  Status2  Status3  Status4  Status5  Summary
cgroup:test_freezer   pass     pass     fail     pass     pass     pass     diff
proc:proc-uptime-001           fail     fail     fail     fail     pass     diff
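
As a rough illustration of how the table above can be read, the sketch below flags a test as `diff` whenever its recorded runs disagree, and splits each row into reference and patched runs. It is not how `ktests.xsh` works internally; `summarize`, `split_runs` and the `same` label are hypothetical names, and treating the missing Status0 entry as "not recorded" is an assumption.

```python
# Statuses per run, copied from the table above; None marks the missing
# Status0 entry for proc:proc-uptime-001.
results = {
    "cgroup:test_freezer": ["pass", "pass", "fail", "pass", "pass", "pass"],
    "proc:proc-uptime-001": [None, "fail", "fail", "fail", "fail", "pass"],
}


def summarize(statuses):
    """Return 'same' when every recorded run agrees, otherwise 'diff'."""
    recorded = {s for s in statuses if s is not None}
    return "same" if len(recorded) <= 1 else "diff"


def split_runs(statuses, n_reference=3):
    """Split a status row into (reference runs, patched runs)."""
    return statuses[:n_reference], statuses[n_reference:]


summary = {test: summarize(row) for test, row in results.items()}
for test, row in results.items():
    reference, patched = split_runs(row)
    print(f"{test}: reference={reference} patched={patched} -> {summary[test]}")
```

Read this way, the only cgroup:test_freezer failure sits in a reference run, and proc:proc-uptime-001 fails on both sides of the comparison.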

Note that igb doesn't have any selftests defined.

Differences highlights

proc:proc-uptime-001

The proc-uptime-001 binary fails an assertion and dumps core, which fails the test:

# proc-uptime-001: proc-uptime-001.c:39: main: Assertion `i1 >= i0' failed.
# /usr/bin/timeout: the monitored command dumped core
# ./kselftest/runner.sh: line 33: 194730 Aborted                 /usr/bin/timeout --foreground "$kselftest_timeout" "$1"
not ok 1 selftests: proc: proc-uptime-001 # exit=134

Looking at the history, it turns out this is a pretty common occurrence on multiple architectures:

$ sqlite3 db/tests-results.sqlite <<EOF
.mode box
SELECT arch, res, count(*) FROM main
WHERE test = 'proc:proc-uptime-001' AND ver = 'ciqlts9_2'
GROUP BY arch, res
EOF

┌─────────┬──────┬──────────┐
│  arch   │ res  │ count(*) │
├─────────┼──────┼──────────┤
│ aarch64 │ fail │ 5        │
│ aarch64 │ pass │ 9        │
│ x86_64  │ fail │ 27       │
│ x86_64  │ pass │ 1        │
└─────────┴──────┴──────────┘

Interestingly enough, this problem started occurring with version 9.2; on pre-9.2 versions proc-uptime-001 works fine:

$ sqlite3 db/tests-results.sqlite <<EOF
.mode box
SELECT ver, res, count(*) FROM main
WHERE test = 'proc:proc-uptime-001'
GROUP BY ver, res
EOF

┌───────────┬──────┬──────────┐
│    ver    │ res  │ count(*) │
├───────────┼──────┼──────────┤
│ ciqlts8_6 │ pass │ 38       │
│ ciqlts8_8 │ pass │ 47       │
│ ciqlts9_2 │ fail │ 32       │
│ ciqlts9_2 │ pass │ 10       │
│ ciqlts9_4 │ fail │ 15       │
│ ciqlts9_4 │ pass │ 5        │
└───────────┴──────┴──────────┘
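
To make the per-version failure rates explicit, here is a minimal sketch that loads the aggregated counts from the table above into an in-memory SQLite database and computes the share of failed runs. The `counts` table and its `n` column are hypothetical and exist only for this example; the real `db/tests-results.sqlite` stores one row per run in `main`.

```python
import sqlite3

# Aggregated counts copied from the table above.
counts = [
    ("ciqlts8_6", "pass", 38),
    ("ciqlts8_8", "pass", 47),
    ("ciqlts9_2", "fail", 32),
    ("ciqlts9_2", "pass", 10),
    ("ciqlts9_4", "fail", 15),
    ("ciqlts9_4", "pass", 5),
]

conn = sqlite3.connect(":memory:")
# Hypothetical helper table, not the schema of the real results database.
conn.execute("CREATE TABLE counts (ver TEXT, res TEXT, n INTEGER)")
conn.executemany("INSERT INTO counts VALUES (?, ?, ?)", counts)

rows = conn.execute(
    """
    SELECT ver,
           SUM(CASE WHEN res = 'fail' THEN n ELSE 0 END) AS failed,
           SUM(n) AS total
    FROM counts
    GROUP BY ver
    ORDER BY ver
    """
).fetchall()

# Fraction of failed runs per kernel version.
fail_rate = {ver: failed / total for ver, failed, total in rows}
for ver, rate in fail_rate.items():
    print(f"{ver}: {rate:.1%} failed")
```

With these counts, about three quarters of the runs fail on both ciqlts9_2 and ciqlts9_4, while none fail on ciqlts8_6 or ciqlts8_8.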

Tagging proc:proc-uptime-001 as flappy for LTS 9.2 and LTS 9.4 in rocky.yml for now, pending a more thorough investigation.

cgroup:test_freezer

The test fails with /sys/fs/cgroup/cg_test_ptrace not being frozen:

$ ./ktests.xsh show_groups --test cgroup:test_freezer -s kselftests*.log

kselftests--ciqlts9_2--run1.log:
kselftests--ciqlts9_2--run2.log:
kselftests--ciqlts9_2-CVE-2023-45871--run1.log:
kselftests--ciqlts9_2-CVE-2023-45871--run2.log:
kselftests--ciqlts9_2-CVE-2023-45871--run3.log:
cgroup:test_freezer:
# ok 1 test_cgfreezer_simple
# ok 2 test_cgfreezer_tree
# ok 3 test_cgfreezer_forkbomb
# ok 4 test_cgfreezer_mkdir
# ok 5 test_cgfreezer_rmdir
# ok 6 test_cgfreezer_migrate
# ok 7 test_cgfreezer_ptrace
# ok 8 test_cgfreezer_stopped
# ok 9 test_cgfreezer_ptraced
# ok 10 test_cgfreezer_vfork
ok 1 selftests: cgroup: test_freezer

kselftests--ciqlts9_2--run3.log:
cgroup:test_freezer:
# ok 1 test_cgfreezer_simple
# ok 2 test_cgfreezer_tree
# ok 3 test_cgfreezer_forkbomb
# ok 4 test_cgfreezer_mkdir
# ok 5 test_cgfreezer_rmdir
# ok 6 test_cgfreezer_migrate
# Cgroup /sys/fs/cgroup/cg_test_ptrace isn't frozen
# not ok 7 test_cgfreezer_ptrace
# ok 8 test_cgfreezer_stopped
# ok 9 test_cgfreezer_ptraced
# ok 10 test_cgfreezer_vfork
not ok 1 selftests: cgroup: test_freezer # exit=1

Across all 53 historical test runs so far, every failure involved /sys/fs/cgroup/cg_test_ptrace:

$ sqlite3 db/tests-results.sqlite <<EOF
.mode line
SELECT res, count(*) AS cnt, contents_body
FROM main
WHERE test = 'cgroup:test_freezer' AND ver = 'ciqlts9_2'
GROUP BY res, contents_body
EOF

          res = fail
          cnt = 4
contents_body = # ok 1 test_cgfreezer_simple
# ok 2 test_cgfreezer_tree
# ok 3 test_cgfreezer_forkbomb
# ok 4 test_cgfreezer_mkdir
# ok 5 test_cgfreezer_rmdir
# ok 6 test_cgfreezer_migrate
# Cgroup /sys/fs/cgroup/cg_test_ptrace isn't frozen
# not ok 7 test_cgfreezer_ptrace
# ok 8 test_cgfreezer_stopped
# ok 9 test_cgfreezer_ptraced
# ok 10 test_cgfreezer_vfork

          res = pass
          cnt = 49
contents_body = # ok 1 test_cgfreezer_simple
# ok 2 test_cgfreezer_tree
# ok 3 test_cgfreezer_forkbomb
# ok 4 test_cgfreezer_mkdir
# ok 5 test_cgfreezer_rmdir
# ok 6 test_cgfreezer_migrate
# ok 7 test_cgfreezer_ptrace
# ok 8 test_cgfreezer_stopped
# ok 9 test_cgfreezer_ptraced
# ok 10 test_cgfreezer_vfork

This behavior cannot be meaningfully compared to other versions, as cgroup:test_freezer is skipped on Rocky versions older than 9.2 with

# 1..0 # SKIP cgroup v2 isn't mounted

and for ciqlts9_4 the sample is too small (4):

$ sqlite3 db/tests-results.sqlite <<EOF
.mode box
SELECT ver, res, count(*) AS cnt
FROM main
WHERE test = 'cgroup:test_freezer'
GROUP BY ver, res
EOF

┌───────────┬──────┬─────┐
│    ver    │ res  │ cnt │
├───────────┼──────┼─────┤
│ ciqlts8_8 │ skip │ 70  │
│ ciqlts9_2 │ fail │ 4   │
│ ciqlts9_2 │ pass │ 49  │
│ ciqlts9_4 │ pass │ 4   │
└───────────┴──────┴─────┘

Tagging cgroup:test_freezer as flappy for version ciqlts9_2 in rocky.yml for now, pending a more thorough investigation.
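
For a sense of how likely such a flake is to show up in a comparison like this one, the sketch below estimates the chance of at least one cgroup:test_freezer failure across six runs, using the ciqlts9_2 history above. It assumes that runs fail independently at the historical rate, which is a simplification, and the variable names are only illustrative.

```python
fails, passes = 4, 49  # ciqlts9_2 history from the table above
runs = 6  # 3 reference runs + 3 patched runs in this comparison

# Historical per-run failure rate.
p_fail = fails / (fails + passes)

# Probability that at least one of the runs fails, assuming independence.
p_any_failure = 1 - (1 - p_fail) ** runs

print(f"per-run failure rate: {p_fail:.3f}")
print(f"chance of >=1 failure in {runs} runs: {p_any_failure:.3f}")
```

Under that assumption the chance is somewhat over one in three, so a single failure among six runs is unremarkable.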

Specific tests: skipped

jira VULN-6698
cve CVE-2023-45871
commit-author Radoslaw Tyl <[email protected]>
commit bb5ed01

Increase the RX buffer size to 3K when the SBP bit is on. The size of
the RX buffer determines the number of pages allocated which may not
be sufficient for receive frames larger than the set MTU size.

	Cc: [email protected]
Fixes: 89eaefb ("igb: Support RX-ALL feature flag.")
	Reported-by: Manfred Rudigier <[email protected]>
	Signed-off-by: Radoslaw Tyl <[email protected]>
	Tested-by: Arpana Arland <[email protected]> (A Contingent worker at Intel)
	Signed-off-by: Tony Nguyen <[email protected]>
	Signed-off-by: David S. Miller <[email protected]>
(cherry picked from commit bb5ed01)
	Signed-off-by: Marcin Wcisło <[email protected]>
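
The effect described in the commit message can be illustrated with a toy model. This is a sketch under stated assumptions, not the driver code: `rx_buffer_size`, `small_frame_limit` and the 2048-byte default are assumptions made for the example, and only the 3K size and the SBP condition come from the commit message.

```python
RX_BUFFER_DEFAULT = 2048  # assumed default buffer size in bytes
RX_BUFFER_LARGE = 3072    # the "3K" buffer named in the commit message


def rx_buffer_size(max_frame_size, sbp_enabled, patched,
                   small_frame_limit=1536):
    """Pick the RX buffer size for a ring in this simplified model.

    small_frame_limit is an assumed threshold above which the large
    buffer is already selected regardless of the patch.
    """
    needs_large = max_frame_size > small_frame_limit
    if patched:
        # The fix: with store-bad-packet (SBP) on, frames larger than the
        # MTU can be received, so the large buffer is always selected.
        needs_large = needs_large or sbp_enabled
    return RX_BUFFER_LARGE if needs_large else RX_BUFFER_DEFAULT


for patched in (False, True):
    size = rx_buffer_size(max_frame_size=1522, sbp_enabled=True,
                          patched=patched)
    print(f"patched={patched}: rx buffer = {size} bytes")
```

In this model, a standard-MTU configuration with SBP on keeps the smaller buffer before the patch and gets the 3K buffer after it.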
Collaborator

@bmastbergen bmastbergen left a comment

🥌

@bmastbergen bmastbergen merged commit aceebde into ctrliq:ciqlts9_2 May 30, 2025
2 checks passed