Replacing DB names with compilation DBs before network call #1
base: 1.5.latest-fork
Conversation
mainly questions
@@ -45,6 +45,7 @@
 from dbt.events.types import AdapterEventWarning
 from dbt.ui import line_wrap_message, warning_tag

+from dbt.adapters.snowflake.compilation_db_replacer import CompilationDBReplacer
If we leave it here, we should probably add a comment that it's imported from a different project (though I think even copying it to live here would be better than this weird import cycle). Maybe we can put the compilation_db_replacer in fd_common and then just add it as a regular project dependency, and to pass the correct DB names around we can use a file or something.
Wouldn't it cause a cyclic dependency between packages? Python will compile, since it is not a cyclic dependency between Python files, but it will cause an architectural nightmare later. Anyway, ildis suggested creating a new dependency package that will hold only compilation_db_replacer.py.
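For reference, a minimal sketch of what such a standalone compilation_db_replacer module could look like; only the class name and customer_db_to_compilation_db() are taken from this PR's diff, and the suffix-based naming rule is purely an assumption:

```python
# Hypothetical sketch of compilation_db_replacer.py as its own small package.
# Only the class name and customer_db_to_compilation_db() appear in the diff;
# the "_COMPILATION" suffix convention below is an assumption for illustration.

class CompilationDBReplacer:
    COMPILATION_SUFFIX = "_COMPILATION"  # assumed naming convention

    @classmethod
    def customer_db_to_compilation_db(cls, database):
        """Map a customer database name to its compilation database name."""
        if not database or database.upper().endswith(cls.COMPILATION_SUFFIX):
            # Unset or already a compilation DB: leave untouched.
            return database
        return f"{database}{cls.COMPILATION_SUFFIX}"
```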
-database=creds.database,
+database=compilation_db,
cool!
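For context, the surrounding connection-opening code presumably ends up looking roughly like this (a sketch; only the database=compilation_db keyword comes from the diff, and the rest of the snowflake.connector.connect() call is illustrative):

```python
# Sketch of the open-connection path in connections.py with the replacement
# applied before the network call. Only database=compilation_db comes from
# the diff; the other connect() arguments are illustrative.
import snowflake.connector

from dbt.adapters.snowflake.compilation_db_replacer import CompilationDBReplacer


def open_snowflake_connection(creds):
    # Swap the customer DB for its compilation DB before connecting.
    compilation_db = CompilationDBReplacer.customer_db_to_compilation_db(creds.database)
    return snowflake.connector.connect(
        account=creds.account,
        user=creds.user,
        warehouse=creds.warehouse,
        database=compilation_db,  # was: database=creds.database
    )
```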
if "already exists" in repr(e): | ||
connection, cursor = self._add_begin_commit_only_queries( |
when does this happen, and why is the fix to start a new transaction? why did replacing the database name cause this?
I don't know why some queries run twice now; it is a bit worrying. This happens when you try to create a seed table twice.
'commit' is just a random query that always succeeds and allows me to get a cursor, but you are right, it might have adverse effects.
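Roughly what that fallback looks like, assuming _add_begin_commit_only_queries just issues a harmless 'commit' to obtain a fresh (connection, cursor) pair; only the "already exists" check and the method name come from the diff:

```python
# Minimal sketch of the "already exists" fallback; the real dbt adapter
# plumbing is omitted. Only the "already exists" check and the
# _add_begin_commit_only_queries name mirror the diff.
class ConnectionsSketch:
    def __init__(self, connection):
        self.connection = connection  # assumed: an open snowflake.connector connection

    def add_query(self, sql):
        cursor = self.connection.cursor()
        cursor.execute(sql)
        return self.connection, cursor

    def add_query_tolerating_duplicates(self, sql):
        try:
            return self.add_query(sql)
        except Exception as e:
            # Some seed-table creates are issued twice; the second attempt
            # fails with "already exists", so swallow it but still return a
            # usable (connection, cursor) pair.
            if "already exists" in repr(e):
                return self._add_begin_commit_only_queries()
            raise

    def _add_begin_commit_only_queries(self):
        # 'commit' always succeeds, so it is used here only to obtain a cursor.
        return self.add_query("commit")
```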
-results = self.execute_macro(LIST_SCHEMAS_MACRO_NAME, kwargs={"database": database})
+compilation_db = CompilationDBReplacer.customer_db_to_compilation_db(database)
+results = self.execute_macro(LIST_SCHEMAS_MACRO_NAME, kwargs={"database": compilation_db})
hmm, the name of the database in the request should already be replaced by the code you added in the Snowflake adapter, no..? why do you need to replace it here too? (same question for all the changes in this file)
All these functions are called directly from the 'dbt' package, in addition to dbt calling the functions in connections.py.
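In other words, methods in impl.py are separate entry points that dbt core calls with the customer database name, so the mapping has to be repeated there; something like the following (a sketch; only the customer_db_to_compilation_db() / execute_macro() pair mirrors the diff):

```python
# Sketch of an impl.py entry point that dbt core calls directly with the
# customer database name, bypassing the substitution done in connections.py.
# Only the customer_db_to_compilation_db() / execute_macro() pair mirrors the
# diff; the surrounding class is a stand-in.
from dbt.adapters.snowflake.compilation_db_replacer import CompilationDBReplacer

LIST_SCHEMAS_MACRO_NAME = "list_schemas"


class SnowflakeAdapterSketch:
    def execute_macro(self, macro_name, kwargs):
        raise NotImplementedError  # stand-in for dbt's macro machinery

    def list_schemas(self, database):
        # dbt core passes the customer DB, so map it before the macro runs.
        compilation_db = CompilationDBReplacer.customer_db_to_compilation_db(database)
        return self.execute_macro(LIST_SCHEMAS_MACRO_NAME, kwargs={"database": compilation_db})
```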
dbt/adapters/snowflake/impl.py
Outdated
@@ -254,3 +260,33 @@ def submit_python_job(self, parsed_model: dict, compiled_code: str):

     def valid_incremental_strategies(self):
         return ["append", "merge", "delete+insert"]
+
+    def drop_relation(self, relation):
+        with RelationDBSubstitution(relation):
why is this implemented with a context manager that mutates the relation, and not just copy(relation) and replace the relevant field? that way it won't need to deal with cleanup at all.
It at least needs to be a deep copy, since I am substituting a second-level field (relation.path.database), so who knows what complexities that would cause. My solution is simple and works. Do you expect issues with it? Concurrency could be an issue in theory, but in practice the class uses a single Snowflake adapter, which is not thread safe anyway.
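For comparison, the two approaches under discussion sketched side by side, assuming the relation objects here are mutable; RelationDBSubstitution's actual implementation is not shown in the diff, so both versions below are assumptions:

```python
# Two hypothetical ways to substitute relation.path.database; neither is the
# PR's actual implementation, which is not shown in the diff.
import copy
from contextlib import contextmanager

from dbt.adapters.snowflake.compilation_db_replacer import CompilationDBReplacer


# Approach 1 (assumed shape of RelationDBSubstitution): mutate in place and
# restore the original database on exit, hence the cleanup step.
@contextmanager
def relation_db_substitution(relation):
    original_db = relation.path.database
    relation.path.database = CompilationDBReplacer.customer_db_to_compilation_db(original_db)
    try:
        yield relation
    finally:
        relation.path.database = original_db  # the cleanup the reviewer is asking about


# Approach 2 (reviewer's suggestion): deep-copy and replace the nested field,
# so nothing shared is mutated and no cleanup is needed.
def with_compilation_db(relation):
    substituted = copy.deepcopy(relation)  # deep copy because path.database is nested
    substituted.path.database = CompilationDBReplacer.customer_db_to_compilation_db(
        substituted.path.database
    )
    return substituted
```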
Force-pushed from b72b5a1 to e0b87ac (compare).
Force-pushed from e0b87ac to a96b443 (compare).
No description provided.