Feat: Add support for Restatements on SCD Type 2 models #4814

Merged
themisvaltinos merged 5 commits into main from themis/scd2rest
Jun 30, 2025

Conversation

themisvaltinos
Contributor

@themisvaltinos themisvaltinos commented Jun 25, 2025

This update enables restatements from a specific start date for SCD Type 2 models (previously only full restatements were supported). Closes #3642.

The logic for the latest and static parts is updated so that restatements work when a custom start date is used, while the end date defaults to the end of the latest interval.

@themisvaltinos themisvaltinos force-pushed the themis/scd2rest branch 4 times, most recently from 9e9d48b to d817097 Compare June 26, 2025 21:31
@themisvaltinos themisvaltinos requested review from eakmanrq and a team June 26, 2025 21:46
@erindru
Collaborator

erindru commented Jun 27, 2025

We probably want to mention the nuance about restatements in the docs. If I understand correctly, these are the two options:

  • Full restatement (essentially drop + recreate)
  • Partial restatement where the table is wiped out from a certain date onwards, and you set that date.

I think it should be explicitly documented that partial restatements of specific sections of the table (with both a start and an end date) are not supported.

# SCD Type 2 validation that end date is the latest interval if it was provided
if not is_preview and self.is_scd_type_2 and self.intervals:
    requested_start, requested_end = removal_interval
    latest_end = max(interval[1] for interval in self.intervals)
Member

@izeigerman izeigerman Jun 27, 2025

Intervals are sorted already, so you can just do self.intervals[-1][1]

Contributor Author

Thank you! Changed it; I forgot they were sorted.
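As a minimal sketch of the suggestion in this thread (not the project's actual code): when the interval list is sorted ascending and non-overlapping, the end of the last interval equals the maximum end, so the scan can be replaced by an index lookup. The function names and the integer timestamps below are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple

# Assumption: an interval is a (start_ts, end_ts) pair of epoch-millisecond integers.
Interval = Tuple[int, int]


def latest_end_scan(intervals: List[Interval]) -> int:
    """Original approach from the diff: scan every interval for the largest end."""
    return max(end for _, end in intervals)


def latest_end_sorted(intervals: List[Interval]) -> int:
    """Reviewer's suggestion: take the end of the last interval.

    Only valid when `intervals` is sorted ascending and non-overlapping.
    """
    return intervals[-1][1]


intervals = [(0, 100), (100, 200), (300, 400)]
assert latest_end_scan(intervals) == latest_end_sorted(intervals) == 400
```

The lookup is O(1) instead of O(n), but it silently returns the wrong value if the sortedness assumption is ever broken.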


get_console().log_warning(
    f"SCD Type 2 model '{self.model.name}' does not support end date in restatements.\n"
    f"Requested end date [{to_ts(requested_end)}] doesn't match latest interval end date [{to_ts(latest_end)}].\n"
Member

@izeigerman izeigerman Jun 27, 2025

It warns about the problem but it doesn't say anything about what will happen. I suggest following the pattern of a similar warning above:

Expanding the requested restatement intervals from...

Contributor Author

Added a similar message saying that the end date will be set to the latest interval end.
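The behaviour discussed in this thread can be sketched roughly as follows, assuming intervals are sorted (start, end) pairs of integer timestamps. The function name `expand_scd2_removal_interval` and the exact warning wording are hypothetical and are not taken from the merged code; the `!=` comparison mirrors the diff hunk quoted in this review.

```python
from typing import List, Optional, Tuple

# Assumption: an interval is a (start_ts, end_ts) pair of epoch-millisecond integers.
Interval = Tuple[int, int]


def expand_scd2_removal_interval(
    removal_interval: Interval, intervals: List[Interval]
) -> Tuple[Interval, Optional[str]]:
    """Return the interval to restate plus an optional warning message.

    The requested start is kept; the end is replaced by the end of the
    latest processed interval whenever the two differ.
    """
    requested_start, requested_end = removal_interval
    if not intervals:
        return removal_interval, None

    latest_end = intervals[-1][1]  # relies on `intervals` being sorted
    if requested_end == latest_end:
        return removal_interval, None

    # Hypothetical wording: state the problem and what will happen next.
    warning = (
        f"Requested end [{requested_end}] doesn't match latest interval end "
        f"[{latest_end}]. Expanding the restatement to end at [{latest_end}]."
    )
    return (requested_start, latest_end), warning


interval, warning = expand_scd2_removal_interval((150, 200), [(0, 100), (100, 300)])
assert interval == (150, 300)
assert warning is not None
```

Returning the message instead of logging it keeps the sketch free of console dependencies; a caller would pass it to whatever logger it uses.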

if not is_preview and self.is_scd_type_2 and self.intervals:
    requested_start, requested_end = removal_interval
    latest_end = max(interval[1] for interval in self.intervals)
    if requested_end != latest_end:
Member

What about requested_end being greater than latest_end? Isn't that fine?

if truncate
else existing_rows_query.where(
    exp.and_(
        valid_from_col <= cleanup_ts,
Contributor

Is this needed? The other condition is that valid_to_col < cleanup_ts, which means valid_from_col would also be <= cleanup_ts, right?

Contributor Author

Yes, it is indeed redundant. Removed this and kept only the valid_to condition.
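The redundancy argument in this thread can be checked with a small plain-Python sketch. It assumes every closed row satisfies valid_from <= valid_to, and that open rows (valid_to is None, standing in for SQL NULL) never pass the filter. The `Row` type and both function names are hypothetical illustrations, not the engine adapter's code.

```python
from typing import List, NamedTuple, Optional


class Row(NamedTuple):
    key: str
    valid_from: int
    valid_to: Optional[int]  # None means the row is still current


def keep_with_both_conditions(rows: List[Row], cleanup_ts: int) -> List[Row]:
    """Filter as originally written: valid_from <= ts AND valid_to < ts."""
    return [
        r
        for r in rows
        if r.valid_to is not None
        and r.valid_from <= cleanup_ts
        and r.valid_to < cleanup_ts
    ]


def keep_with_valid_to_only(rows: List[Row], cleanup_ts: int) -> List[Row]:
    """Simplified filter: valid_to < ts already implies valid_from < ts."""
    return [r for r in rows if r.valid_to is not None and r.valid_to < cleanup_ts]


rows = [
    Row("a", 10, 20),    # closed before the cleanup point: kept
    Row("a", 20, 50),    # spans the cleanup point: dropped
    Row("a", 50, None),  # current row: dropped
    Row("b", 35, None),  # starts after the cleanup point: dropped
]
assert keep_with_both_conditions(rows, 30) == keep_with_valid_to_only(rows, 30)
assert keep_with_valid_to_only(rows, 30) == [Row("a", 10, 20)]
```

If a table could contain rows with valid_from later than valid_to, the two filters would diverge, so the simplification depends on that invariant holding.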

@themisvaltinos themisvaltinos merged commit d0f69c3 into main Jun 30, 2025
26 checks passed
@themisvaltinos themisvaltinos deleted the themis/scd2rest branch June 30, 2025 21:22
Successfully merging this pull request may close these issues.

Support Restatements on SCD Type 2 and Use Them When Creating Preview Clone
4 participants