Fix fsync race on shutdown #7070

Draft
wants to merge 6 commits into
base: main
2 changes: 1 addition & 1 deletion .github/workflows/ci.yml
@@ -181,7 +181,7 @@ jobs:
git config --global --add safe.directory /__w/CCF/CCF
mkdir build
cd build
- cmake -GNinja -DCOMPILE_TARGET=snp -DCMAKE_BUILD_TYPE=Debug ..
+ cmake -GNinja -DCOMPILE_TARGET=snp -DCMAKE_BUILD_TYPE=Debug -DSAN=ON ..
ninja
shell: bash

7 changes: 0 additions & 7 deletions CHANGELOG.md
@@ -1674,13 +1674,11 @@ Key-Value Store
#### Certificate(s) Validity Period

- Nodes certificates validity period is no longer hardcoded and must instead be set by operators and renewed by members (#2924):

- The new `node_certificate.initial_validity_days` (defaults to 1 day) configuration entry lets operators set the initial validity period for the node certificate (valid from the current system time).
- The new `command.start.service_configuration.maximum_node_certificate_validity_days` (defaults to 365 days) configuration entry sets the maximum validity period allowed for node certificates.
- The new `set_node_certificate_validity` proposal action allows members to renew a node certificate (or `set_all_nodes_certificate_validity` equivalent action to renew _all_ trusted nodes certificates).

- Service certificate validity period is no longer hardcoded and must instead be set by operators and renewed by members (#3363):

- The new `service_certificate_initial_validity_days` (defaults to 1 day) configuration entry lets operators set the initial validity period for the service certificate (valid from the current system time).
- The new `maximum_service_certificate_validity_days` (defaults to 365 days) configuration entry sets the maximum validity period allowed for service certificate.
- The new `set_service_certificate_validity` proposal action allows members to renew the service certificate.
@@ -1961,13 +1959,11 @@ Key-Value Store
#### Certificate(s) Validity Period

- Nodes certificates validity period is no longer hardcoded and must instead be set by operators and renewed by members (#2924):

- The new `node_certificate.initial_validity_days` (defaults to 1 day) configuration entry lets operators set the initial validity period for the node certificate (valid from the current system time).
- The new `command.start.service_configuration.maximum_node_certificate_validity_days` (defaults to 365 days) configuration entry sets the maximum validity period allowed for node certificates.
- The new `set_node_certificate_validity` proposal action allows members to renew a node certificate (or `set_all_nodes_certificate_validity` equivalent action to renew _all_ trusted nodes certificates).

- Service certificate validity period is no longer hardcoded and must instead be set by operators and renewed by members (#3363):

- The new `service_certificate_initial_validity_days` (defaults to 1 day) configuration entry lets operators set the initial validity period for the service certificate (valid from the current system time).
- The new `maximum_service_certificate_validity_days` (defaults to 365 days) configuration entry sets the maximum validity period allowed for service certificate.
- The new `set_service_certificate_validity` proposal action allows members to renew the service certificate.
@@ -2710,17 +2706,14 @@ The 1.0 release will require minimal changes from this release.
### Added

- Experimental

- New CCF nodes can now join from a [snapshot](https://microsoft.github.io/CCF/ccf-0.13.0/operators/start_network.html#resuming-from-existing-snapshot) (#1500, #1532)
- New KV maps can now be created dynamically in a transaction (#1507, #1528)

- CLI

- Subject Name and Subject Alternative Names for the node certificates can now be passed to cchost using the --sn and --san CLI switches (#1537)
- Signature and ledger splitting [flags](https://microsoft.github.io/CCF/ccf-0.13.0/operators/start_network.html#signature-interval) have been renamed more accurately (#1534)

- Governance

- `user_data` can be set at user creation, as well as later (#1488)

- Javascript
3 changes: 3 additions & 0 deletions src/host/main.cpp
@@ -997,10 +997,13 @@ int main(int argc, char** argv) // NOLINT(bugprone-exception-escape)
// callbacks to be despatched, so as to avoid memory being
// leaked by handles. Capped out of abundance of caution.
constexpr size_t max_iterations = 1000;
+ constexpr size_t max_delay_us = 1'000'000; // 1 second
size_t close_iterations = max_iterations;
while ((uv_loop_alive(uv_default_loop()) != 0) && (close_iterations > 0))
{
uv_run(uv_default_loop(), UV_RUN_NOWAIT);
+ constexpr size_t per_iteration_sleep_length = max_delay_us / max_iterations;
+ usleep(per_iteration_sleep_length);
close_iterations--;
}
LOG_INFO_FMT(
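For illustration, the capped drain loop in the `src/host/main.cpp` hunk can be written as a standalone helper. This is a minimal sketch, not the PR's code: `drain_with_budget`, `has_pending`, and `step` are hypothetical names standing in for the `uv_loop_alive` and `uv_run` calls, and `std::this_thread::sleep_for` is used in place of `usleep`. The total delay budget is split evenly across the iteration cap, as the hunk does with `max_delay_us / max_iterations`.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <thread>

// Hypothetical helper (not part of CCF): runs `step` while `has_pending`
// reports outstanding work, sleeping between attempts so that pending
// callbacks get time to complete. The loop is capped at `max_iterations`.
// Returns the number of unused iterations; 0 means the cap was reached
// (or that max_iterations was 0).
inline std::size_t drain_with_budget(
  const std::function<bool()>& has_pending,
  const std::function<void()>& step,
  std::size_t max_iterations,
  std::chrono::microseconds max_total_delay)
{
  if (max_iterations == 0)
  {
    return 0;
  }

  // Split the total delay budget evenly across the iteration cap.
  const auto per_iteration_sleep = max_total_delay /
    static_cast<std::chrono::microseconds::rep>(max_iterations);

  std::size_t remaining = max_iterations;
  while (has_pending() && remaining > 0)
  {
    step();
    std::this_thread::sleep_for(per_iteration_sleep);
    --remaining;
  }
  return remaining;
}
```

With the values in the hunk (1000 iterations, a budget of 1'000'000 microseconds), each iteration would sleep for 1000 microseconds, so the loop waits roughly one second in the worst case before giving up.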
1 change: 1 addition & 0 deletions src/snapshots/snapshot_manager.h
@@ -111,6 +111,7 @@ namespace snapshots
asynchost::TimeBoundLogger log_if_slow(
fmt::format("Committing snapshot - fsync({})", data->tmp_file_name));
fsync(data->snapshot_fd);
+ usleep(50000); // Force a delayed fsync of 50ms
}

close(data->snapshot_fd);