DB Migrations #221
Conversation
Seems like only commits after 6e6169f need to be reviewed in here.
@@ -0,0 +1,58 @@
package main
Wow, +325 LOC for what is not a real migration, just adding 2 columns to 1 table? That seems like a LOT... sorry if I have misunderstood the purpose of this; please do not hesitate to clarify that here.
Do you think there might be a way to avoid this complexity, and instead create a separate issue for adopting a real migration tool?
As discussed IRL (I'm sorry if that was not articulated well enough), I think it's nicer to have a DML SQL command in a small .sql file to add the new columns. From my side, having a custom mechanism implemented to handle this task is 🙅‍♂️: at least for now it seems like a lot of overhead (that needs to be maintained) to have custom code that needs to be `go run`, etc.
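For illustration, the kind of single-file DML/DDL migration suggested here might look like the sketch below. This is an assumption-heavy sketch: it presumes the internal DB is SQLite, and the column names `uast_a`/`uast_b` are hypothetical, not taken from the actual schema.

```python
import sqlite3

# In-memory DB as a stand-in for internal.db (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE file_pairs (id INTEGER PRIMARY KEY, name TEXT)")

# The whole "migration" as plain SQL, no custom Go command needed.
# New columns are nullable by default in SQLite.
conn.executescript("""
ALTER TABLE file_pairs ADD COLUMN uast_a BLOB;
ALTER TABLE file_pairs ADD COLUMN uast_b BLOB;
""")

# Inspect the resulting schema.
cols = [row[1] for row in conn.execute("PRAGMA table_info(file_pairs)")]
print(cols)  # → ['id', 'name', 'uast_a', 'uast_b']
```

The same two `ALTER TABLE` statements could live in a small `.sql` file and be fed to the `sqlite3` CLI directly.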
There are two scripts.
The first one adds the new columns. It's 58 lines because it wraps the query in a command; most of the lines are command documentation.
The second one is 176 lines and assigns UASTs to existing FilePairs. It is needed because I couldn't guarantee that the provided table with UASTs is the same as the current one, so it could not simply be replaced; even more, I could guarantee that it is different, with different ids and fewer columns. Transforming the provided table into the current one would require more work, and it would not be reusable.
Because of that, the command just looks for FilePairs without a UAST and fills them in by looking them up in the source table.
About having separate commands instead of a bunch of queries: it seems easier to test and to reproduce the process, which shortens the time needed for the migration itself.
The commands themselves can be reused for creating more commands while staying consistent between them. I copied the idea from Borges, which provides a clean way to package its commands.
About the maintenance overhead, I cannot see it, since maintenance won't be needed (the same as for other CLI scripts in CAT that are not supported nor prod ready).
If you are really concerned about having unmaintained code in the repo, we can delete all these scripts tomorrow, or as soon as we finish this task. Anyway, I think it's better to validate that the modifications to be done are what we expect.
About creating an issue to add a migration tool, I agree, but as we agreed, it is something that should maybe be done in the future.
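As a rough illustration of that fill-in strategy (not the actual script — the real tool is written in Go, and the table shapes and the `blob_id` join key here are invented), the second command's core logic amounts to something like:

```python
import sqlite3

# In-memory stand-in; with real files the source DB would be attached instead.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE file_pairs (id INTEGER PRIMARY KEY, blob_id TEXT, uast BLOB);
-- provided table: different shape, no ids of its own
CREATE TABLE source_pairs (blob_id TEXT, uast BLOB);
INSERT INTO file_pairs VALUES (1, 'aaa', NULL), (2, 'bbb', NULL), (3, 'ccc', x'01');
INSERT INTO source_pairs VALUES ('aaa', x'aa'), ('bbb', x'bb');
""")

# Only rows still missing a UAST are touched; existing ids are never rewritten.
conn.execute("""
UPDATE file_pairs
SET uast = (SELECT s.uast FROM source_pairs s WHERE s.blob_id = file_pairs.blob_id)
WHERE uast IS NULL
""")

missing = conn.execute(
    "SELECT COUNT(*) FROM file_pairs WHERE uast IS NULL"
).fetchone()[0]
print(missing)  # → 0 when every FilePair found a match in the source table
```

Rows that cannot be matched in the source table simply keep a NULL `uast`, which is what the validation query later looks for.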
👍 For a really nice PR description, with a detailed test plan, etc.
If I did not misunderstand the purpose of this in #221 (review), for this change I would strongly prefer having a single DML statement in a single .sql file instead of custom code.
Sorry, I don't get why we need all this code for a one-time task, and especially why we need it in master. PS. It looks like it should be possible to achieve with just a couple of plain SQL statements.
Didn't see any obvious problems in the code.
Taking into account that this is a one-off internal script that won't be supported, I won't comment on whether or not it is overkill.
Thanks @smacker for your suggestion. On the other hand, if we agree that non-production code should not be in master, then, to be consistent, we could also schedule that task to be done separately.
(force-pushed from dcd750d to 25d847e)
I still think this approach is stretching the limits a bit, and I politely disagree with the argument that having multiple commands does not bring maintenance overhead, but I commit to the solution that @dpordomingo, as the author of this PR and a maintainer of this project, suggests.
LGTM
The main goal of this PR was to ensure that there was no error in the migration process.
(force-pushed from 06bc285 to fd6524b)
@dpordomingo applying the migration locally without merging to master would be great! Thanks!
(force-pushed from b487c67 to 9d59aff)
(force-pushed from 1913173 to f113671)
@dpordomingo is there anything left before merging this?
I understood from #221 (comment) that we won't merge it, due to your concerns explained in the discussion above.
Migrations were successfully applied, and the new DB was pushed into Prod and Staging. The last migration will be applied once the ML Feature Extraction API has been integrated into CAT.
Depends on #206 and #232; required by #197 and #233.

This PR defines the migration from our current Staging DB, with no `uast` cols, to a new internal DB having `uast` nullable cols in the `file_pairs` table, as required by #206.
This PR also removes the `diff` column from the internal DB, as required by #232.
This PR also removes the `features` table from the internal DB, as required by #233.

How to use it:
- current DB (`internal.db`) → can be downloaded from the export tool
- source DB (`source.db`) → from the ML issue, stored in gdrive
- `files` table

How to validate the result:
- with this command/query you can fetch those FilePairs whose UASTs could not be imported
- if you made a backup of the `file_pairs` table, you can ensure that primary keys were kept by running the check; the returned value will be the amount of PKs that were messed up (it should be `0`)

Disclaimer:
- These are not real migrations with an up/down feature.
- You can read more details about these scripts in `migrations/README.md`.
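The primary-key check from the validation steps above might be sketched as follows. This is a sketch only: it assumes SQLite, and the backup table name `file_pairs_backup` and the `blob_id_a` column are illustrative, not the project's actual names.

```python
import sqlite3

# In-memory stand-in for internal.db, with a pre-migration backup table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE file_pairs (id INTEGER PRIMARY KEY, blob_id_a TEXT);
CREATE TABLE file_pairs_backup (id INTEGER PRIMARY KEY, blob_id_a TEXT);
INSERT INTO file_pairs VALUES (1, 'aaa'), (2, 'bbb');
INSERT INTO file_pairs_backup VALUES (1, 'aaa'), (2, 'bbb');
""")

# Count backup rows whose primary key no longer points at the same pair.
# A successful migration keeps every PK, so this should return 0.
messed = conn.execute("""
SELECT COUNT(*) FROM file_pairs_backup b
LEFT JOIN file_pairs f ON f.id = b.id AND f.blob_id_a = b.blob_id_a
WHERE f.id IS NULL
""").fetchone()[0]
print(messed)  # → 0 when all primary keys were kept
```

Any nonzero result would point at backup rows whose id was reassigned or dropped during the migration.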