-
Notifications
You must be signed in to change notification settings - Fork 243
Feat: Add support for Restatements on SCD Type 2 models #4814
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
9e9d48b
to
d817097
Compare
We probably want to mention the nuance about restatements in the docs. If I understand correctly, these are the two options:
I think it should be explicitly documented that partial restatements of specific sections of the table are not supported |
d817097
to
a4a17fa
Compare
sqlmesh/core/snapshot/definition.py
Outdated
# SCD Type 2 validation that end date is the latest interval if it was provided | ||
if not is_preview and self.is_scd_type_2 and self.intervals: | ||
requested_start, requested_end = removal_interval | ||
latest_end = max(interval[1] for interval in self.intervals) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Intervals are sorted already, so you can just do self.intervals[-1][1]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you! changed it forgot they were sorted
sqlmesh/core/snapshot/definition.py
Outdated
|
||
get_console().log_warning( | ||
f"SCD Type 2 model '{self.model.name}' does not support end date in restatements.\n" | ||
f"Requested end date [{to_ts(requested_end)}] doesn't match latest interval end date [{to_ts(latest_end)}].\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It warns about the problem but it doesn't say anything about what will happen. I suggest following the pattern of a similar warning above:
Expanding the requested restatement intervals from...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added a similar message to say it will be set to the latest interval end
sqlmesh/core/snapshot/definition.py
Outdated
if not is_preview and self.is_scd_type_2 and self.intervals: | ||
requested_start, requested_end = removal_interval | ||
latest_end = max(interval[1] for interval in self.intervals) | ||
if requested_end != latest_end: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about request_end
being greater than latest_end
? Isn't that fine?
sqlmesh/core/engine_adapter/base.py
Outdated
if truncate | ||
else existing_rows_query.where( | ||
exp.and_( | ||
valid_from_col <= cleanup_ts, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this needed? The other condition is that valid_to_col < cleanup_ts
which means valid_from_col
would also be <= cleanupo_ts
right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes it is indeed redundant, removed this and kept only the valid_to condition
a4a17fa
to
3dd7193
Compare
3dd7193
to
1ca3bf1
Compare
This update aims to enable restatements from a specific point for SCD Type 2 (previously only supported full restatements), closes #3642
The logic is updated for the latest and static parts so that it enables restatements when a custom start date is used, while for the end date defaults to the latest interval.