zfs-2.3.3 patchset #17468

behlendorf · 2025-06-17T17:57:31Z

Motivation and Context

Proposed zfs-2.3.3 patchset based on the commits in the zfs-2.3.3-staging branch.

Description

Bump META file to support 6.15 kernel, zfs send/recv encryption fixes, misc fixes.

How Has This Been Tested?

Previously tested by CI, well be retested in this PR.

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Performance enhancement (non-breaking change which improves efficiency)
Code cleanup (non-breaking change which makes code smaller or more readable)
Quality assurance (non-breaking change which makes the code more robust against bugs)
Breaking change (fix or feature that would cause existing functionality to change)
Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
Documentation (a change to man pages or other documentation)

Checklist:

My code follows the OpenZFS code style requirements.
I have updated the documentation accordingly.
I have read the contributing document.
I have added tests to cover my changes.
I have run the ZFS Test Suite with this change applied.
All commit messages are properly formatted and contain Signed-off-by.

Previous implementation of zap_leaf_array_free() put chunks on the free list in reverse order. Also zap_leaf_transfer_entry() and zap_entry_remove() were freeing name and value arrays in reverse order. Together this created a mess in the free list, making following allocations much more fragmented than necessary. This patch re-implements zap_leaf_array_free() to keep existing chunks order, and implements non-destructive zap_leaf_array_copy() to be used in zap_leaf_transfer_entry() to allow properly ordered freeing name and value arrays there and in zap_entry_remove(). With this change test of some writes and deletes shows percent of non-contiguous chunks in DDT reducing from 61% and 47% to 0% and 17% for arrays and frees respectively. Sure some explicit sorting could do even better, especially for ZAPs with variable-size arrays, but it would also cost much more, while this should be very cheap. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Sponsored by: iXsystems, Inc. Closes openzfs#16766 (cherry picked from commit 9a81484)

Since embedded blocks introduction 11 years ago, their writing was blocked if dedup is enabled. After searching through the modern code I see no reason for this restriction to exist. Same time embedded blocks are dramatically cheaper. Even regular write of so small blocks would likely be cheaper than deduplication, even if the last is successful, not mentioning otherwise. Reviewed-by: Allan Jude <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Sponsored by: iXsystems, Inc. Closes openzfs#17113 (cherry picked from commit 09f4dd0)

@ImAwsumm

It seems `fio` in `ddt_dedup_vdev_limit` overwhelms the system with the amount of dirty data caused by DDT updates within one TXG due to tiny 1KB records used, while I see no reason for this test to extend the TXGs beyond default. Signed-off-by: Alexander Motin <[email protected]> Sponsored by: iXsystems, Inc. Reviewed-by: Allan Jude <[email protected]> Reviewed-by: @ImAwsumm Reviewed-by: Tony Hutter <[email protected]> (cherry picked from commit 21850f5)

dbuf_prefetch_impl() should look on level of current indirect, not the target prefetch level. dbuf_prefetch_indirect_done() should call dnode_level_is_l2cacheable() if we have dpa_dnode to pass it. It should fix some both false positive and negative L2ARC caching. While there, fix redacted feature activation assertions. One was always true, while another could give false positive if dpa_dnode is NULL. George Amanakis <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Sponsored by: iXsystems, Inc. Closes openzfs#17204 (cherry picked from commit a497c5f)

…nzfs#17228) Various tools will display draid vdev names with parameters embedded in them, but would not accept them as valid vdev names when looking them up, making it difficult to build pipelines involving draid vdevs. This commit makes it so that if a full draid name is offered for match, it gets truncated at the first ':' character. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Signed-off-by: Rob Norris <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Tony Hutter <[email protected]> (cherry picked from commit 131df3b)

@ImAwsumm

- Fix VERIFY3B() when given non-boolean values. - Map EQUIV() into VERIFY3B(,==,) as equivalent. - Tune messages for better readability and to closer match source code for easier search. Unify user-space messages with kernel. - Tune printed types and remove %px outside of Linux kernel. Signed-off-by: Alexander Motin <[email protected]> Sponsored by: iXsystems, Inc. Reviewed-by: @ImAwsumm Reviewed-by: Rob Norris <[email protected]> Reviewed-by: Tony Hutter <[email protected]> (cherry picked from commit 4866c2f)

The test writes 1M of 1KB blocks, which may produce up to 1GB of dirty data. On top of that ashift=12 likely produces additional 4GB of ZIO buffers during sync process. On top of that we likely need some page cache since the pool reside on files. And finally we need to cache the DDT. Not surprising that the test regularly ends up in OOMs, possibly depending on TXG size variations. Also replace fio with pretty strange parameter set with a set of dd writes and TXG commits, just as we neeed here. While here, remove compression. It has nothing to do here, but waste CI CPU time. Signed-off-by: Alexander Motin <[email protected]> Sponsored by: iXsystems, Inc. Reviewed-by: Paul Dagnelie <[email protected]> Reviewed-by: Tony Hutter <[email protected]> (cherry picked from commit 1d8f625)

- Kill workload first for faster cleanup. - Use `zpool wait` for resilver instead of `sleep`. - Remove irrelevant workload from `online_offline_003_neg`. Reviewed-by: George Melikov <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Sponsored by: iXsystems, Inc. Closes: openzfs#17259 (cherry picked from commit cb49e77)

Replace `sleep 15` with `zpool wait`, which should take much less than the 15 seconds. And considering it is called 16 times, this should save us up to 4 minutes total. Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Tino Reichardt <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Sponsored by: iXsystems, Inc. Closes: openzfs#17257 (cherry picked from commit 8f2c2de)

All defined variants: - freebsd13-4r, freebsd13-5r, freebsd14-1r, freebsd14-2r (RELEASE) - freebsd13-5s, freebsd14-2s (STABLE) - freebsd15-0c (CURRENT) Used for testing: - freebsd13-4r (RELEASE) - freebsd14-2s (STABLE) - freebsd15-0c (CURRENT) Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Tino Reichardt <[email protected]> Closes: openzfs#17260 (cherry picked from commit 6afb405)

Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Sponsored by: iXsystems, Inc. Closes: openzfs#17267 (cherry picked from commit 38c3a8b)

Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Tino Reichardt <[email protected]> Reviewed by: Attila Fülöp <[email protected]> Signed-off-by: Artem-OSSRevival <[email protected]> Fixes: openzfs#14538 Closes: openzfs#17234 (cherry picked from commit 37a3e26)

Originally the Lustre ZFS OSD code was going to use zfs_uio_t structs for supporting Direct I/O with ZFS. However, this has changed to using abd_t structs instead. This exports the proper symbols that will be used by the Lustre ZFS OSD code. Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Brian Atkinson <[email protected]> Closes openzfs#17256 (cherry picked from commit 7031a48)

This commit adds support for using llvm-libunwind for kernels built using llvm and clang. The two differences are that the largest register index is given by _LIBUNWIND_HIGHEST_DWARF_REGISTER, we need to check whether the register is a floating point register and the prototype for unw_regname takes the unwind cursor as the first argument. Reviewed-by: Rob Norris <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Sebastian Pauka <[email protected]> Closes openzfs#17230 (cherry picked from commit 1b4826b)

Those tests are write-mostly at the nested pool. Considering we have 3 more layers of caching underneath, we can hint ZFS how to use the memory better by setting primarycache=metadata. While there, add missing zpool sync after rm in checkpoint_capacity before we could potentially see the freed space, would not there be a pool checkpoint. Signed-off-by: Alexander Motin <[email protected]> Sponsored by: iXsystems, Inc. Reviewed-by: Tino Reichardt <[email protected]> Reviewed-by: Tony Hutter <[email protected]> (cherry picked from commit 1ef706c)

Sometimes it fails unable to see any injected write errors. I guess writing 25KB of zeroes might be not enough to trigger errors with probability set to 10%. Lets try to write more. Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Sponsored by: iXsystems, Inc. Closes openzfs#17270 (cherry picked from commit d947b9a)

Reviewed-by: George Melikov <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Tino Reichardt <[email protected]> Closes openzfs#17278 (cherry picked from commit 88ec6c4)

Reviewed-by: George Melikov <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Alexander Ziaee <[email protected]> Reviewed-by: Rob Norris <[email protected]> Signed-off-by: Quentin Thébault <[email protected]> Closes openzfs#17282 (cherry picked from commit 63de2d2)

Don't use KSM on the FreeBSD VMs and optimize KSM settings for Linux to have faster run times. Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Tino Reichardt <[email protected]> Closes openzfs#17247 (cherry picked from commit ba17ced)

### Background Various admin operations will be invoked by some userspace task, but the work will be done on a separate kernel thread at a later time. Snapshots are an example, which are triggered through zfs_ioc_snapshot() -> dsl_dataset_snapshot(), but the actual work is from a task dispatched to dp_sync_taskq. Many such tasks end up in dsl_enforce_ds_ss_limits(), where various limits and permissions are enforced. Among other things, it is necessary to ensure that the invoking task (that is, the user) has permission to do things. We can't simply check if the running task has permission; it is a privileged kernel thread, which can do anything. However, in the general case it's not safe to simply query the task for its permissions at the check time, as the task may not exist any more, or its permissions may have changed since it was first invoked. So instead, we capture the permissions by saving CRED() in the user task, and then using it for the check through the secpolicy_* functions. ### Current implementation The current code calls CRED() to get the credential, which gets a pointer to the cred_t inside the current task and passes it to the worker task. However, it doesn't take a reference to the cred_t, and so expects that it won't change, and that the task continues to exist. In practice that is always the case, because we don't let the calling task return from the kernel until the work is done. For Linux, we also take a reference to the current task, because the Linux credential APIs for the most part do not check an arbitrary credential, but rather, query what a task can do. See secpolicy_zfs_proc(). Again, we don't take a reference on the task, just a pointer to it. ### Changes We change to calling crhold() on the task credential, and crfree() when we're done with it. This ensures it stays alive and unchanged for the duration of the call. On the Linux side, we change the main policy checking function priv_policy_ns() to use override_creds()/revert_creds() if necessary to make the provided credential active in the current task, allowing the standard task-permission APIs to do the needed check. Since the task pointer is no longer required, this lets us entirely remove secpolicy_zfs_proc() and the need to carry a task pointer around as well. Sponsored-by: https://despairlabs.com/sponsor/ Signed-off-by: Rob Norris <[email protected]> Reviewed-by: Pavel Snajdr <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Kyle Evans <[email protected]> Reviewed-by: Tony Hutter <[email protected]> (cherry picked from commit c8fa39b)

This change goes through and quotes variables where appropriate to avoid issues with incorrect splitting. The performance tests ran into an issue with $SUDO_COMMAND splitting incorrectly because it was not quoted. This change fixes that issue and hopefully gets ahead of any other similar problems. Reviewed by: John Wren Kennedy <[email protected]> Reviewed-by: Tony Nguyen <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Aleksandr Liber <[email protected]> Closes openzfs#17235 (cherry picked from commit aa46cc9)

When multiple snapshots prevent the destruction/rollback of the respective dataset/snapshot/volume via zfs destroy or zfs rollback, the error message does not list the blocking snapshots sorted according to their order of creation. This causes inconvenience and can lead to confusion, and also creates a contrast with a returned message from zfs list -t snap function. Closes: openzfs#12751 Signed-off-by: Artem-OSSRevival <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Tony Hutter <[email protected]> (cherry picked from commit 27f3d94)

Decrease the RESILVER_MIN_TIME_MS variable from 50 to 20. So the test, which expects two 2 resilver starts will see them. Logfile of the seen failures before this fix: log: NOTE: expected 2 resilver start(s) after offline/online, found 1 log: expected 2 resilver start(s) after offline/online, found 1 The test time decreases also from around 00:42 to 00:24 seconds. Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Tino Reichardt <[email protected]> Closes openzfs#16822 Closes openzfs#17279 (cherry picked from commit 3b18877)

It's possible for two spares to get attached to a single failed vdev. This happens when you have a failed disk that is spared, and then you replace the failed disk with a new disk, but during the resilver the new disk fails, and ZED kicks in a spare for the failed new disk. This commit checks for that condition and disallows it. Reviewed-by: Akash B <[email protected]> Reviewed-by: Ameer Hamza <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Closes: openzfs#16547 Closes: openzfs#17231 (cherry picked from commit f40ab9e)

zpool_status_003 and zpool_status_004_pos use 'dd' to trigger a read of a file without specifying 'of=/dev/null'. This spams the ZTS logs with ~20MB of garbage data. This commit adds 'of=/dev/null'. Signed-off-by: Tony Hutter <[email protected]> Reviewed-by: Tino Reichardt <[email protected]> Reviewed-by: George Melikov <[email protected]> Reviewed-by: Alexander Motin <[email protected]> (cherry picked from commit a6cca8a)

`S_IFMT` is declared in `sys/stat.h`, but we cannot include this header because it redeclares the `statx` function with different argument types. Therefore, we define `S_IFMT` ourselves, in the same way as the other definitions. Reviewed-by: Rob Norris <[email protected]> Reviewed-by: George Melikov <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: José Luis Salvador Rufo <[email protected]> Closes openzfs#17293 Closes openzfs#17294 (cherry picked from commit 634c172)

We should not clear scn_state and notify waiters until we call vdev_dtl_reassess(), otherwise following offline/detach request may fail with "no valid replicas". Signed-off-by: Alexander Motin <[email protected]> Sponsored by: iXsystems, Inc. (cherry picked from commit f86d9af)

After more CI runs and code reading after openzfs#17259 I've found that online starts resilver via async mechanism, which does not provide wait primitives at this time. Restore some delays to restore CI until this is properly fixed. Signed-off-by: Alexander Motin <[email protected]> Sponsored by: iXsystems, Inc. (cherry picked from commit f85c96e)

…nzfs#17284) txg_wait_synced_sig() is "wait for txg, unless a signal arrives". We expect that future development will require similar "wait unless X" behaviour. This generalises the API as txg_wait_synced_flags(), where the provided flags describe the events that should cause the call to return. Instead of a boolean, the return is now an error code, which the caller can use to know which event caused the call to return. The existing call to txg_wait_synced_sig() is now txg_wait_synced_flags(TXG_WAIT_SIGNAL). Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Signed-off-by: Rob Norris <[email protected]> Reviewed-by: Allan Jude <[email protected]> Reviewed-by: Alexander Motin <[email protected]> (cherry picked from commit a7de203)

With certain combinations of target ARC states balance and ghost hit rates it was possible to get the fractions outside of allowed range. This patch limits maximum balance adjustment speed, which should make it impossible, and also asserts it. Fixes openzfs#17210 Signed-off-by: Alexander Motin <[email protected]> Sponsored by: iXsystems, Inc. Reviewed-by: Rob Norris <[email protected]> Reviewed-by: Tony Hutter <[email protected]> (cherry picked from commit b1ccab1)

Fedora 40 has gone EOL as of May 2025, retire the CI builder. Reviewed-by: Tino Reichardt <[email protected]> Reviewed-by: George Melikov <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes openzfs#17408

After openzfs#17401 the Linux build produces some stack related warnings. Silence them with the `STACK_FRAME_NON_STANDARD` macro. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Tino Reichardt <[email protected]> Signed-off-by: Attila Fülöp <[email protected]> Co-authored-by: Brian Behlendorf <[email protected]> Closes openzfs#17410

The io_uring interface is available as a Technology Preview. Details: https://access.redhat.com/solutions/4723221 Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tino Reichardt <[email protected]> Closes openzfs#17447

Having high-refcount dedup entries for zero blocks is inefficient when they could be recorded as a holes instead. Normally, zero compression is not done if compression is disabled to not confuse naive benchmarks. But with dedup enabled, it is expected that the write will be skipped anyway, so we are just optimizing the way it is skipped. Reviewed-by: Rob Norris <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: George Melikov <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Sponsored by: iXsystems, Inc. Closes openzfs#17435

It is not right, but there are few examples when TX is aborted after being assigned in case of error. To handle it better on production systems add extra cleanup steps. While here, replace couple dmu_tx_abort() in simple cases. Reviewed-by: Rob Norris <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Igor Kozhukhov <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Sponsored by: iXsystems, Inc. Closes openzfs#17438

Looking on txg_wait_synced(, 0) I've noticed that it always syncs 5 TXGs: 3 TXG_CONCURRENT_STATES + 2 TXG_DEFER_SIZE. But in case of dmu_offset_next() we do not care about deferred frees. And even concurrent TXGs we might not need sync all 3 if the dnode was not dirtied in last few TXGs. This patch makes dmu_offset_next() to sync one TXG at a time until the dnode is clean, but no more than 3 TXG_CONCURRENT_STATES times. My tests with random simultaneous writes and seeks over many files on HDD pool show 7-14% performance increase. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Rob Norris <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Sponsored by: iXsystems, Inc. Closes openzfs#17434

Previous dmu_tx_count_clone() was broken, stating that cloning is similar to free. While they might be from some points, cloning is not net-free. It will likely consume space and memory, and unlike free it will do it no matter whether the destination has the blocks or not (usually not, so previous code did nothing). Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Sponsored by: iXsystems, Inc. Closes openzfs#17431

Fairly coarse, but if it returns while the pool suspends, it must be with an error. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Paul Dagnelie <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Rob Norris <[email protected]> Closes openzfs#17420

The superblock pointer will always be set, as will z_log, so remove code supporting cases that can't occur (on Linux at least). Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Paul Dagnelie <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Rob Norris <[email protected]> Closes openzfs#17420

If the pool is suspended, we'll just block in zil_commit(). If the system is shutting down, blocking wouldn't help anyone. So, we should keep this test for now, but at least return an error for anyone who is actually interested. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Paul Dagnelie <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Rob Norris <[email protected]> Closes openzfs#17420

If the kernel will honour our error returns, use them. If not, fool it by setting a writeback error on the superblock, if available. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Paul Dagnelie <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Rob Norris <[email protected]> Closes openzfs#17420

If a write is split across mutliple itxs, we only want the callback on the last one, otherwise it will be called for every itx associated with this single write, which makes it very hard to know what to clean up. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Mark Johnston <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Rob Norris <[email protected]> Closes openzfs#17445

zfs_putpages() would put the entire range of pages onto the ZIL, then return VM_PAGER_OK for each page to the kernel. However, an associated zil_commit() or txg sync had not happened at this point, so the write may not actually be on disk. So, we rework it to use a ZIL commit callback, and do the post-write work of undirtying the page and signaling completion there. We return VM_PAGER_PEND to the kernel instead so it knows that we will take care of it. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Mark Johnston <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Rob Norris <[email protected]> Closes openzfs#17445

- Set/remove "Work in Progress"/"Code Review Needed" for drafts. - Remove "Accepted", "Inactive", "Revision Needed" and "Stale" on pushes and reopens. I hope this reduce chances of PRs being forgotten after requested modifications done due to stale labels. It is better to have no labels than incorrect ones saying there is nothing to look at. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Sponsored by: iXsystems, Inc. Closes openzfs#16721

This also includes removing L2 vdevs asynchronously. This commit also guarantees that spa_load_guid is unique. The zpool reguid feature introduced the spa_load_guid, which is a transient value used for runtime identification purposes in the ARC. This value is not the same as the spa's persistent pool guid. However, the value is seeded from spa_generate_load_guid() which does not check for uniqueness against the spa_load_guid from other pools. Although extremely rare, you can end up with two different pools sharing the same spa_load_guid value! So we guarantee that the value is always unique and additionally not still in use by an async arc flush task. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Allan Jude <[email protected]> Signed-off-by: Don Brady <[email protected]> Closes openzfs#16215

On systems with enormous amounts of memory, the single arc_evict thread can become a bottleneck if reads and writes are stuck behind it, waiting for old data to be evicted before new data can take its place. This commit adds support for evicting from multiple ARC lists in parallel, by farming the evict work out to some number of threads and then accumulating their results. A new tuneable, zfs_arc_evict_threads, sets the number of threads. By default, it will scale based on the number of CPUs. Sponsored-by: Expensify, Inc. Sponsored-by: Klara, Inc. Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Youzhong Yang <[email protected]> Signed-off-by: Allan Jude <[email protected]> Signed-off-by: Mateusz Piotrowski <[email protected]> Signed-off-by: Alexander Stetsenko <[email protected]> Signed-off-by: Rob Norris <[email protected]> Co-authored-by: Rob Norris <[email protected]> Co-authored-by: Mateusz Piotrowski <[email protected]> Co-authored-by: Alexander Stetsenko <[email protected]> Closes openzfs#16486

It's been many years, we can probably do without. Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: George Melikov <[email protected]> Reviewed-by: Pavel Snajdr <[email protected]> Signed-off-by: Rob Norris <[email protected]> Closes openzfs#17376

It makes no sense to limit read size below the block size, since DMU will any way consume resources for the whole block, while the current zfs_vnops_read_chunk_size is only 1MB, which is smaller that maximum block size of 16MB. Plus in case of misaligned Uncached I/O the buffer may get evicted between the chunks, requiring repeating I/Os. On 64-bit platforms increase zfs_vnops_read_chunk_size to 32MB. It allows to less depend on speculative prefetcher if application requests specific size, first not waiting for prefetcher to start and later not prefetching more than needed. Also while there, we don't need to align reads to the chunk size, but only to a block size, which is smaller and so more forgiving. My profiles show ~4% of CPU time saving when reading 16MB blocks. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed by: Igor Kozhukhov <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Sponsored by: iXsystems, Inc. Closes openzfs#17415

These are only required to support these ioctls on Linux <4.5. Since 4.18 is our cutoff, we don't need this code anymore. Also removing related test things that will never match again. Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Rob Norris <[email protected]> Closes openzfs#17308

Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Germano Massullo <[email protected]> Closes openzfs#17461

Update the META file to reflect compatibility with the 6.15 kernel. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Closes openzfs#17393

Signed-off-by: Rob Norris <[email protected]>

behlendorf · 2025-06-19T17:59:50Z

Merged and tagged 2.3.3 tagged.

https://github.com/openzfs/zfs/releases/tag/zfs-2.3.3

amotin and others added 30 commits May 28, 2025 16:00

ZTS: Fix 256MB file leak in zed_cksum_reported

7bb7ff7

Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Sponsored by: iXsystems, Inc. Closes: openzfs#17267 (cherry picked from commit 38c3a8b)

ZTS: Use Ubuntu default url for cloud-image

a33e8b0

Reviewed-by: George Melikov <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Tino Reichardt <[email protected]> Closes openzfs#17278 (cherry picked from commit 88ec6c4)

behlendorf and others added 22 commits June 17, 2025 10:50

ZTS: Enable io_uring on CentOS Stream 9 and 10 also

ea3a600

The io_uring interface is available as a Technology Preview. Details: https://access.redhat.com/solutions/4723221 Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tino Reichardt <[email protected]> Closes openzfs#17447

Fix mixed-use-of-spaces-and-tabs rpmlint warning

777d8ee

Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Germano Massullo <[email protected]> Closes openzfs#17461

Linux 6.15 compat: META

faefa5f

Update the META file to reflect compatibility with the 6.15 kernel. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Closes openzfs#17393

META: 2.3.3

ddf066e

Signed-off-by: Rob Norris <[email protected]>

behlendorf added the Status: Code Review Needed Ready for review and testing label Jun 17, 2025

amotin approved these changes Jun 18, 2025

View reviewed changes

kerberizer mentioned this pull request Jun 18, 2025

upstream: ZFS v2.3.3 archzfs/archzfs#594

Merged

robn mentioned this pull request Jun 19, 2025

Ask for service release providing 6.15 compatibility #17439

Open

behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Jun 19, 2025

behlendorf closed this Jun 19, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

zfs-2.3.3 patchset #17468

zfs-2.3.3 patchset #17468

Uh oh!

behlendorf commented Jun 17, 2025 •

edited by amotin

Loading

Uh oh!

behlendorf commented Jun 19, 2025

Uh oh!

Uh oh!

zfs-2.3.3 patchset #17468

zfs-2.3.3 patchset #17468

Uh oh!

Conversation

behlendorf commented Jun 17, 2025 • edited by amotin Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Motivation and Context

Description

How Has This Been Tested?

Types of changes

Checklist:

Uh oh!

behlendorf commented Jun 19, 2025

Uh oh!

Uh oh!

behlendorf commented Jun 17, 2025 •

edited by amotin

Loading