nrempel
Contributor

@nrempel nrempel commented May 14, 2025

Closes #1660

This PR introduces the dsc cmd all-stopped command to reliably check if all "downstairs" services are stopped. This replaces previous sleep-based checks in tests, reducing potential flakiness.

Changes

  • Added AllStopped command to dsc/src/client.rs and corresponding handling.
  • Implemented the /allstopped GET endpoint in dsc/src/control.rs.
  • Added the all_stopped logic to DscInfo in dsc/src/main.rs, considering Running, Starting, or Stopping states as not stopped.
  • Updated openapi/dsc-control.json with the new endpoint definition.
  • Modified tools/test_dsc.sh to utilize the new all-stopped command.
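The all_stopped rule described in the list above can be sketched as follows. This is a minimal illustration, not the code from dsc/src/main.rs: the `State` enum, its `Stopped`, `Exit`, and `Failed` variants, and both function names are assumptions made for the example. Only the rule that Running, Starting, and Stopping count as "not stopped" comes from the PR description.

```rust
/// Hypothetical per-downstairs state. Only Running, Starting, and Stopping
/// are named in the PR description; the other variants are assumed here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Stopped,
    Starting,
    Running,
    Stopping,
    Exit,
    Failed,
}

/// A downstairs counts as stopped unless it is running or in transition.
pub fn is_stopped(state: State) -> bool {
    !matches!(state, State::Running | State::Starting | State::Stopping)
}

/// True only when every downstairs is stopped.
/// Note: an empty slice yields true, because `all` is vacuously true.
pub fn all_stopped(states: &[State]) -> bool {
    states.iter().all(|s| is_stopped(*s))
}
```

With this sketch, `all_stopped(&[State::Stopped, State::Exit])` is true, while any slice containing `State::Stopping` is false, so a caller that has just issued a stop request would not see "all stopped" until every transition has finished.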

All tests pass with these changes.

Contributor

@leftwo leftwo left a comment

Sorry for the delay on this one, we got busy and I did not get time to look at it.

I was worried that the old test_dsc.sh had a retry loop that we no longer have (to check for stopped) so I ran this code in a test loop for the past few weeks to see if it ever failed to stop, and it never did. So, I'm no longer worried about that.

@leftwo leftwo merged commit 99778b3 into oxidecomputer:main May 27, 2025
17 checks passed
@nrempel
Contributor Author

nrempel commented May 27, 2025

All good, thanks!

@nrempel nrempel deleted the dsc-cmd branch May 27, 2025 23:46
jmpesp added a commit to oxidecomputer/omicron that referenced this pull request Aug 11, 2025
Pick up the following propolis PRs:

- Bump crucible rev to latest (oxidecomputer/propolis#922)
- Added block_size for file backends in propolis_server (workers is optional) (oxidecomputer/propolis#917)

Pick up the following crucible PRs:

- Snapshots existing already are ok! (oxidecomputer/crucible#1759)
- Less verbose logging (oxidecomputer/crucible#1756)
- Remove unused `Vec<JoinHandle>` (oxidecomputer/crucible#1754)
- Split "check reconciliation state" from "start reconciliation" (oxidecomputer/crucible#1732)
- Improve `ClientIoTask` start logic (oxidecomputer/crucible#1731)
- Use data-bearing enum variants pattern in negotiation (oxidecomputer/crucible#1727)
- Make Downstairs stoppable (oxidecomputer/crucible#1730)
- Don't log every region's metadata (oxidecomputer/crucible#1729)
- Compute reconciliation from `ClientMap` instead of three clients (oxidecomputer/crucible#1726)
- Make Offline -> Faulted transition happen without reconnecting (oxidecomputer/crucible#1725)
- Remove `WaitActive` state during negotiation (oxidecomputer/crucible#1722)
- Add explicit `UpstairsState::Disabled` (oxidecomputer/crucible#1721)
- Print version on startup for pantry and agent (oxidecomputer/crucible#1723)
- Update test mem to also show physical space used by regions. (oxidecomputer/crucible#1724)
- Add AllStopped command and endpoint for downstairs status (oxidecomputer/crucible#1718)
- Update tests to honor REGION_SETS env if provided. (oxidecomputer/crucible#1720)
- Fix panic in `set_active_request` when client is in Stopping(Replacing) state (oxidecomputer/crucible#1717)
- DTrace updates (oxidecomputer/crucible#1715)
Development

Successfully merging this pull request may close these issues.

dsc cmd should have a "are all downstairs stopped" check
2 participants