Decayed log optional migration #9945

Merged
merged 4 commits into from
Jun 23, 2025

ziggie1984
Collaborator

@ziggie1984 ziggie1984 commented Jun 12, 2025

Builds on top of #9929.

This PR adds an Optional Migration which is by default set to true.

Looking for ACK or NACK.

Contributor

coderabbitai bot commented Jun 12, 2025

Important

Review skipped

Auto reviews are limited to specific labels.

🏷️ Labels to auto review (1)
  • llm-review


@ziggie1984 ziggie1984 self-assigned this Jun 12, 2025
@ziggie1984 ziggie1984 requested a review from Roasbeef June 12, 2025 21:04
@ziggie1984 ziggie1984 force-pushed the optional-migration branch from 233b756 to fae32e9 Compare June 12, 2025 21:06
@ziggie1984 ziggie1984 marked this pull request as ready for review June 12, 2025 21:07
@ziggie1984 ziggie1984 added this to the v0.19.2 milestone Jun 12, 2025
@ziggie1984 ziggie1984 force-pushed the optional-migration branch from fae32e9 to c9c0e45 Compare June 12, 2025 21:09
@ziggie1984 ziggie1984 requested a review from Copilot June 12, 2025 21:12
@Copilot Copilot AI left a comment

Pull Request Overview

This PR introduces an optional migration for garbage collecting the decayed log while refactoring related configuration and migration handling in the codebase. Key changes include adding new configuration flags in sample-lnd.conf and lncfg/db.go; modifying htlcswitch components to incorporate a “reforward” flag along with updated logging; and integrating a new migration (migration34) into channeldb with accompanying tests and documentation updates.

Reviewed Changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments.

Summary per file:

| File | Description |
| --- | --- |
| sample-lnd.conf | Adds new configuration comments and flags for decayed log GC |
| lncfg/db.go | Replaces the old PruneRevocation flag with NoGcDecayedLog and NoPruneDecayedLog options |
| htlcswitch/mock.go | Updates the DecodeHopIterator signature with an added bool parameter |
| htlcswitch/link.go | Introduces early logging for completed fwdPkgs and refactors the decoding flow |
| htlcswitch/hop/iterator.go | Modifies the iterator decoding to account for reforwarding condition |
| htlcswitch/decayedlog.go | Removes legacy bucket operations and returns nil instead |
| channeldb/options.go | Updates the optional migration configuration and adds decayed log options |
| channeldb/migration34/* | Adds a new migration module for decayed log garbage collection |
| channeldb/meta_test.go | Adjusts migration testing to verify that all optional migrations run correctly |
| channeldb/db.go | Integrates the new migration logic and updates the meta information accordingly |
| docs/release-notes/release-notes-0.19.2.md | Updates release notes to include the new optional migration |

Comments suppressed due to low confidence (1)

channeldb/options.go:26

  • The term 'OptionalMiragtionConfig' appears to be misspelled; it should be 'OptionalMigrationConfig' for clarity and consistency.
type OptionalMiragtionConfig struct {

@ziggie1984 ziggie1984 force-pushed the optional-migration branch 3 times, most recently from 99987d2 to e78253c Compare June 13, 2025 06:45
Member

@yyforyongyu yyforyongyu left a comment

ACK on the approach.

@ziggie1984
Collaborator Author

After migration:

Screenshot 2025-06-13 at 12 42 24

@saubyk
Collaborator

saubyk commented Jun 16, 2025

This PR adds an Optional Migration which is by default set to true.

If it's an optional migration, shouldn't it be false by default?

@ziggie1984
Collaborator Author

ziggie1984 commented Jun 16, 2025

If it's an optional migration, shouldn't it be false by default?

Good point. I decided to make it true by default because there is absolutely no reason to keep the data that is deleted in the migration. I chose the optional migration over the mandatory one because it is the less intrusive change to the code base, and because this is a minor release, people can easily downgrade back to 19.1/19.0 if something is wrong with the migration. So I think it is totally fine to have this default set to true.

- The replay protection is
[optimized](https://github.com/lightningnetwork/lnd/pull/9929) to use less disk
space such that the `sphinxreplay.db` or the `decayedlogdb_kv` table will grow
much slower.
Contributor

Suggested change
much slower.
much more slowly.

Collaborator Author

This change belongs to #9929.

Member

@yyforyongyu yyforyongyu left a comment

Looking good - my main comment is to split the commit a bit so we can focus on the details.

@ziggie1984 ziggie1984 force-pushed the optional-migration branch 4 times, most recently from 7f30276 to 2ee3115 Compare June 18, 2025 06:53
@ziggie1984
Collaborator Author

ziggie1984 commented Jun 18, 2025

OK, this PR still has one question to resolve:

This PR introduces a protection mechanism so that people who upgrade to a newer version with a new optional migration applied cannot simply downgrade their software, because after a downgrade we have no way to apply the optional migration again, since we record it in the Optional Metadata. But we still need to solve the case where people start up 19.2 and then, for whatever reason, downgrade to 19.1 again. This is not covered yet; we probably need to add a mandatory migration to prevent this?

Alternatively, we make every optional migration idempotent, so that running it more than once is harmless, and then we can remove the Optional Metadata bucket.
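
To make the idempotency idea concrete, here is a minimal sketch in Go. It does not use lnd's actual migration code: the `store` type, the `gcBucket` function and the bucket names are hypothetical stand-ins for a key-value database. The point it illustrates is that a "delete the bucket if it exists" migration can be run any number of times with the same end state, so nothing would need to record that it already ran.

```go
package main

import "fmt"

// store is a hypothetical stand-in for a key-value database: it maps a
// bucket name to the keys stored in that bucket.
type store map[string]map[string][]byte

// gcBucket deletes the named bucket if it exists and reports whether it
// removed anything. Deleting a bucket that is already gone is a no-op,
// which is what makes this migration idempotent.
func gcBucket(s store, name string) bool {
	if _, ok := s[name]; !ok {
		return false
	}
	delete(s, name)
	return true
}

func main() {
	// Bucket names are made up for this example.
	s := store{
		"legacy-bucket": {"k1": []byte("v1")},
		"live-bucket":   {"k2": []byte("v2")},
	}

	// Running the migration twice leaves the store in the same state as
	// running it once.
	fmt.Println(gcBucket(s, "legacy-bucket")) // true: first run removes the bucket
	fmt.Println(gcBucket(s, "legacy-bucket")) // false: second run finds nothing to do
	fmt.Println(len(s))                       // 1: the unrelated bucket is untouched
}
```

With a migration shaped like this, a downgrade followed by an upgrade would simply clean up again on the next start.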

@ziggie1984 ziggie1984 force-pushed the optional-migration branch from 2ee3115 to 53042bb Compare June 18, 2025 08:26
@ziggie1984
Collaborator Author

ziggie1984 commented Jun 18, 2025

This PR introduces a protection mechanism so that people who upgrade to a newer version with a new optional migration applied cannot simply downgrade their software, because after a downgrade we have no way to apply the optional migration again, since we record it in the Optional Metadata. But we still need to solve the case where people start up 19.2 and then, for whatever reason, downgrade to 19.1 again. This is not covered yet; we probably need to add a mandatory migration to prevent this?

Solved this by incrementing the migration number for migration 34. So this change will prevent LND users from reverting to previous versions.

@ziggie1984 ziggie1984 force-pushed the optional-migration branch 2 times, most recently from f69a36b to 873ed94 Compare June 18, 2025 08:33
@ziggie1984 ziggie1984 requested a review from yyforyongyu June 18, 2025 08:33
Member

@yyforyongyu yyforyongyu left a comment

One question and I think it's good to go!

channeldb/db.go Outdated
// downgrading to a prior version, otherwise the
// decayed log db will not be properly cleaned up if
// a users has downgraded and then upgrades again.
number: 34,
Member

I think it's fine to leave it uncleaned if the user decides to downgrade then upgrade - the optional migration mechanism should be removed once we are fully SQLized, it was created as the difficulty involved in the revocation log was huge, and we wanted to leave it to the users to decide to migrate or not based on their data size and machines. For this new migration tho, it's different as it's a shortcut for us to implement the migration, not that the migration itself is difficult.

I'm also not sure how this helps with the case when a user downgrades then upgrades, I think the optional meta has already been saved so it won't be applied again, and here the ver num is for mandatory migrations, which has no effect on optional ones? Essentially these are two independent migration mechanisms.

Collaborator Author

@ziggie1984 ziggie1984 Jun 18, 2025

I'm also not sure how this helps with the case when a user downgrades then upgrades, I think the optional meta has already been saved so it won't be applied again, and here the ver num is for mandatory migrations, which has no effect on optional ones? Essentially these are two independent migration mechanisms.

So if you run 19.2, we increase the DB version, which means you cannot revert to LND 19.1 (even if the user does not apply the optional migration). And if the user does apply it (it defaults to true), we can be sure that they are not reverting back.

I think it's fine to leave it uncleaned if the user decides to downgrade then upgrade - the optional migration mechanism should be removed once we are fully SQLized, it was created as the difficulty involved in the revocation log was huge, and we wanted to leave it to the users to decide to migrate or not based on their data size and machines. For this new migration tho, it's different as it's a shortcut for us to implement the migration, not that the migration itself is difficult.

Let's see what the second review thinks and then we decide (open for both approaches).

Member

So if you run 19.2, we increase the DB version, which means you cannot revert to LND 19.1 (even if the user does not apply the optional migration). And if the user does apply it (it defaults to true), we can be sure that they are not reverting back.

Ok I just realized this is a powerful yet dangerous mechanism to stop people from downgrading nodes, something we didn't have before. In theory we could have this mechanism every time we create a release, tho I'm unsure atm if that's what the users want. If we have this, it means not only can they not downgrade from 19.2 to 19.1, but also that it is not possible to downgrade from 20 to 19.x.

In reality tho, we already stop users from downgrading anyway whenever there's a mandatory migration, so adding this just makes the optional migration a bit more mandatory, in the sense that the user cannot go back to a previous version where we still save the old data, which we now no longer use.
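
The downgrade block discussed in this thread can be sketched roughly as follows. This is an illustration under assumptions, not lnd's code: `checkVersion` and `errDBReversion` are hypothetical names, and the number 33 for the previous latest migration is assumed for the example (only 34 appears in the diff above). The idea is that a binary refuses to open a database whose stored version is higher than the highest mandatory migration it knows about.

```go
package main

import (
	"errors"
	"fmt"
)

// errDBReversion is a hypothetical error for a database that was written by
// newer software than the binary now opening it.
var errDBReversion = errors.New("database version is newer than this binary supports")

// checkVersion compares the version stored in the database with the highest
// mandatory migration number the running binary knows about.
func checkVersion(dbVersion, latestKnown uint32) error {
	if dbVersion > latestKnown {
		return errDBReversion
	}
	return nil
}

func main() {
	// Assumed numbers for illustration: the older binary knows migrations
	// up to 33, the newer binary bumps the stored version to 34.
	fmt.Println(checkVersion(33, 33)) // older binary, untouched database
	fmt.Println(checkVersion(34, 33)) // older binary after the newer one ran
	fmt.Println(checkVersion(33, 34)) // newer binary, database not yet bumped
}
```

Only the second call returns an error, which is the "cannot go back" behaviour described above. Note that this check involves the mandatory version number only; the record of applied optional migrations is a separate mechanism, as pointed out earlier in the thread.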

@ziggie1984 ziggie1984 force-pushed the optional-migration branch 2 times, most recently from 23f9148 to 511cc46 Compare June 18, 2025 12:53
Member

@yyforyongyu yyforyongyu left a comment

LGTM🦾

Some context for other reviewers on why we chose the optional migration route:

  • the mandatory migration is built for channel.db only, while here we need to work on the sphinxreplay.db. So there's a gap already.
  • to make it work in mandatory migration is pretty straightforward - we can just add a new field to mandatoryVersion to instruct which db it should work on, without touching any of the previous migration methods.
  • the core issue is, again, the mandatory migration is made for channel.db only, which means the meta is saved there. In addition, we use a single tx to perform the migration and save its version num to the meta (putMeta), and this tx is derived from the db we are working on, which is channel.db. Unless we want to break the atomicity there, there's no way to make this a mandatory migration.

So I think it's fine to emulate a mandatory migration using this optional approach, otherwise we will need to expand the scope to redesign the mandatory migration to allow multiple DBs. The only risk I'm seeing here is, if a new unrelated bug pops up in 19.2, and the user wants to downgrade to a previous version, it would now be impossible. So a second opinion here is appreciated.
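
The atomicity argument above can be illustrated with a small in-memory sketch. Everything here is hypothetical (`db`, `update`, `migrate`, `errBoom`, and the version number 33); it only mimics the pattern of applying a migration and recording its version number inside one transaction on one database.

```go
package main

import (
	"errors"
	"fmt"
)

// errBoom is a made-up failure used to trigger a rollback in the example.
var errBoom = errors.New("boom")

// db is a hypothetical in-memory database holding data and a version number.
type db struct {
	data    map[string]string
	version uint32
}

// update runs fn against a copy of the database and commits the copy only
// if fn returns nil, mimicking a single atomic transaction.
func (d *db) update(fn func(tx *db) error) error {
	tx := &db{data: make(map[string]string, len(d.data)), version: d.version}
	for k, v := range d.data {
		tx.data[k] = v
	}
	if err := fn(tx); err != nil {
		return err // roll back: d is untouched
	}
	*d = *tx
	return nil
}

// migrate applies the migration and records the new version in the same
// transaction, so either both take effect or neither does.
func migrate(d *db, number uint32, migration func(tx *db) error) error {
	return d.update(func(tx *db) error {
		if err := migration(tx); err != nil {
			return err
		}
		tx.version = number
		return nil
	})
}

func main() {
	d := &db{data: map[string]string{"old": "x"}, version: 33}

	// A failing migration leaves both the data and the version untouched.
	err := migrate(d, 34, func(tx *db) error {
		delete(tx.data, "old")
		return errBoom
	})
	fmt.Println(err, d.version, len(d.data))

	// A successful migration commits the data change and the version bump
	// together.
	err = migrate(d, 34, func(tx *db) error {
		delete(tx.data, "old")
		return nil
	})
	fmt.Println(err, d.version, len(d.data))
}
```

The guarantee holds only because the data and the version live in the same database and share one transaction. If the data lived in a second database, as with sphinxreplay.db here, the data change and the version write could no longer commit together, which is the gap described in the comment above.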

@ziggie1984 ziggie1984 force-pushed the optional-migration branch from 511cc46 to e0d95ca Compare June 19, 2025 14:43
@ziggie1984
Collaborator Author

I also talked to @guggero about the way forward here. We decided to make this migration truly optional, because nothing will break if the user, for whatever reason, downgrades from 19.2 back to 19.1 or earlier. And because everything will be SQLized in the near term, we will eventually clean up the "garbage data".
This approach is also in line with our general policy of not adding mandatory DB migrations for minor releases.

@saubyk
Collaborator

saubyk commented Jun 19, 2025

We decided to make this migration truly optional

So, optional as in opt-in or opt-out? I discussed this with @Roasbeef yesterday and we think this should be an opt-in migration.

@guggero
Collaborator

guggero commented Jun 19, 2025

We decided to make this migration truly optional

So, optional as in opt-in or opt-out? I discussed this with @Roasbeef yesterday and we think this should be an opt-in migration.

Currently it's opt-out, but we can change that if we feel most users won't want to run it.

@ziggie1984
Collaborator Author

ziggie1984 commented Jun 19, 2025

I would say we are OK with the opt-out method, because with the new approach the migration is not mandatory in the sense that it would prevent you from reverting to a previous version. Therefore, if something goes wrong with 19.2, the user can just go back to 19.1. As mentioned above, the only small disadvantage of this approach is that if the user then goes back to 19.2 or later, the optional migration will not run again, leaving an unused bucket in the storage. That isn't a problem, because it is not much data and it will be garbage-collected when we move to native SQL. I would like to run the garbage collection by default because it frees a lot of disk space for big nodes and speeds things up for the Postgres SQL backend, since no index is required anymore.
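
A rough sketch of the opt-out behaviour described here, under stated assumptions: `optionalMeta`, `shouldRun` and the migration name are hypothetical and do not mirror lnd's actual types or config options. The migration runs on startup unless the user opts out, and it is skipped once it has been recorded as applied.

```go
package main

import "fmt"

// optionalMeta is a hypothetical record of which optional migrations have
// already been applied, keyed by migration name.
type optionalMeta map[string]bool

// shouldRun reports whether an optional migration should run on startup.
// skip is the user's opt-out flag; it is false by default, so the migration
// runs unless the user disables it.
func shouldRun(meta optionalMeta, name string, skip bool) bool {
	if skip {
		return false
	}
	return !meta[name]
}

func main() {
	const name = "gc-decayed-log" // hypothetical migration name
	meta := optionalMeta{}

	fmt.Println(shouldRun(meta, name, false)) // true: default config, never applied

	meta[name] = true
	fmt.Println(shouldRun(meta, name, false)) // false: already recorded as applied

	fmt.Println(shouldRun(optionalMeta{}, name, true)) // false: user opted out
}
```

The second case is the downgrade-then-upgrade scenario mentioned above: because the migration is already recorded, it is not run again, and any bucket recreated by the older version stays in place.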

@Roasbeef
Member

Can be rebased now that the dep PR was merged!

@Roasbeef
Member

I think that, compared to other migrations that actually read/write a large number of keys, this migration should be much faster (at least on bbolt), as bucket deletion just marks a page on disk as free and doesn't actually delete anything until one runs a compaction.

Migration34 garbage collects the decayed log database. This commit only adds the migration code and does not use it.

This commit adds the migration code for the decayedlog db which is optional and will default to true.
@ziggie1984 ziggie1984 force-pushed the optional-migration branch from e0d95ca to 8208eb6 Compare June 21, 2025 07:20
@Roasbeef Roasbeef merged commit 45c1564 into lightningnetwork:master Jun 23, 2025
38 checks passed
@ziggie1984 ziggie1984 deleted the optional-migration branch June 24, 2025 06:46
6 participants