Change: Replace Peewee with SQLAlchemy/Alembic#1417
arikfr merged 80 commits into getredash:master
Conversation
redash/handlers/base.py
Outdated
| return fn(*args, **kwargs) | ||
| except DoesNotExist: | ||
| rv = fn(*args, **kwargs) | ||
| if rv is None: |
There was a problem hiding this comment.
SQLA will never raise an exception for missing rows?
There was a problem hiding this comment.
its .get() and .first() methods return None when there's no entry, so I believe that's correct.
There was a problem hiding this comment.
Maybe we can get rid of this helper, as it seems that Flask-SQLAlchemy has its own: first_or_404 and get_or_404.
There was a problem hiding this comment.
> SQLA will never raise an exception for missing rows?
See also .one() and .one_or_none()
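For reference, a minimal sketch of how these lookups behave when no row matches, assuming SQLAlchemy 1.4+ and an in-memory SQLite database. The `Widget` model is hypothetical and only stands in for a redash model.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.exc import NoResultFound
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Widget(Base):  # hypothetical model, for illustration only
    __tablename__ = "widgets"
    id = Column(Integer, primary_key=True)
    name = Column(String(50))


engine = create_engine("sqlite://")  # in-memory database
Base.metadata.create_all(engine)

with Session(engine) as session:
    query = session.query(Widget).filter(Widget.id == 42)  # no such row

    first_result = query.first()        # returns None, no exception
    maybe_result = query.one_or_none()  # returns None as well

    try:
        query.one()                     # one() insists on exactly one row
        one_raised = False
    except NoResultFound:
        one_raised = True
```

So a helper that checks for `None` covers `.first()` and `.one_or_none()`, while `.one()` still needs an `except` clause.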
| Flask-RESTful==0.3.5 | ||
| Flask-Login==0.3.2 | ||
| Flask-OAuthLib==0.9.2 | ||
| Flask-SQLAlchemy==2.1 |
There was a problem hiding this comment.
In another project I used alchy which also has Flask-Alchy which is a drop-in replacement for it. The benefit is that we won't need Flask session whenever we're using the DB. The downside is that alchy never seemed to gain mind share unlike Flask-SQLAlchemy.
TL;DR: we can keep Flask-SQLA and see if it adds too much boilerplate to the jobs code. If it doesn't, keep it. Otherwise consider alchy.
There was a problem hiding this comment.
My suspicion is that it won't add boilerplate, calling create_app() should be most of it.
There was a problem hiding this comment.
Re: alchy, it seems the author has mostly moved on to https://github.com/dgilland/sqlservice#history
| data_source_id = Column(db.Integer, db.ForeignKey("data_sources.id"), nullable=True) | ||
| data_source = db.relationship(DataSource) | ||
| latest_query_data_id = Column(db.Integer, db.ForeignKey("query_results.id"), nullable=True) | ||
| latest_query_data = db.relationship(QueryResult) |
There was a problem hiding this comment.
Why do we need both latest_query_data_id and latest_query_data? (This applies to all similar fields.) Doesn't SQLA have a convenience method to get the object id without loading the object itself?
There was a problem hiding this comment.
SQLA is a bit more explicit about this stuff; the foo_id field is the actual db column, and foo is the attribute for the related ORM object.
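A small sketch of that distinction, assuming SQLAlchemy 1.4+ and in-memory SQLite. The two models are trimmed-down stand-ins for the ones in the diff, not the actual redash definitions.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, inspect
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class DataSource(Base):  # simplified stand-in
    __tablename__ = "data_sources"
    id = Column(Integer, primary_key=True)
    name = Column(String(50))


class Query(Base):  # simplified stand-in
    __tablename__ = "queries"
    id = Column(Integer, primary_key=True)
    # The actual database column.
    data_source_id = Column(Integer, ForeignKey("data_sources.id"), nullable=True)
    # The ORM attribute for the related object; not a column itself.
    data_source = relationship(DataSource)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Query(id=1, data_source=DataSource(id=7, name="warehouse")))
    session.commit()

with Session(engine) as session:
    query = session.get(Query, 1)
    source_id = query.data_source_id  # read straight from the queries row
    unloaded_before = "data_source" in inspect(query).unloaded
    source_name = query.data_source.name  # first access lazy-loads the object
    unloaded_after = "data_source" in inspect(query).unloaded
```

Reading `data_source_id` is therefore the way to get the id without loading the related object.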
jeffwidman
left a comment
There was a problem hiding this comment.
Thanks for working on this! I added a few comments, hopefully they're helpful... didn't have time to review it all.
redash/cli/database.py
Outdated
| from redash.models import db, create_db, init_db | ||
| create_db(True, True) | ||
| init_db() | ||
| db.session.commit() |
There was a problem hiding this comment.
Why do you need the db.session.commit()?
I assume create_db or init_db calls Flask-SQLAlchemy's create_all(), which will implicitly call a commit:
http://stackoverflow.com/questions/34410091/flask-sqlalchemy-how-can-i-call-db-create-all-and-db-drop-all-without-trigg
| class TimestampMixin(object): | ||
| updated_at = Column(db.DateTime(True), default=db.func.now(), | ||
| onupdate=db.func.now(), nullable=False) | ||
| created_at = Column(db.DateTime(True), default=db.func.now(), |
There was a problem hiding this comment.
Maybe this should use server_default?
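To illustrate the difference, here is a sketch that renders the `CREATE TABLE` statement for both variants (table names are made up for the example). With `default`, SQLAlchemy supplies the value in its own INSERT statements; with `server_default`, the default becomes part of the schema, so it also applies to rows inserted outside the ORM.

```python
from sqlalchemy import Column, DateTime, Integer, MetaData, Table, func
from sqlalchemy.schema import CreateTable

metadata = MetaData()

# Hypothetical tables that differ only in how the timestamp default is declared.
client_side = Table(
    "events_client", metadata,
    Column("id", Integer, primary_key=True),
    Column("created_at", DateTime(True), default=func.now(), nullable=False),
)

server_side = Table(
    "events_server", metadata,
    Column("id", Integer, primary_key=True),
    Column("created_at", DateTime(True), server_default=func.now(), nullable=False),
)

# Compile the DDL with the generic dialect; no database connection is needed.
client_ddl = str(CreateTable(client_side))
server_ddl = str(CreateTable(server_side))

print(client_ddl)  # no DEFAULT clause on created_at
print(server_ddl)  # created_at carries a DEFAULT clause
```

Switching to `server_default` is a schema change, so it would need a migration.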
redash/models.py
Outdated
| @classmethod | ||
| def get_by_id_and_org(cls, object_id, org): | ||
| return cls.get(cls.id == object_id, cls.org == org) | ||
| return cls.query.filter(cls.id == object_id, cls.org == org).first() |
There was a problem hiding this comment.
Will this query ever return more than one result? If not, should probably use one() or one_or_none()
| @classmethod | ||
| def get_by_slug(cls, slug): | ||
| return cls.get(cls.slug == slug) | ||
| return cls.query.filter(cls.slug == slug).first() |
There was a problem hiding this comment.
I suspect this should also be a one() or one_or_none() as I think you want errors if more than one result is returned for a given slug.
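A sketch of why this matters, assuming SQLAlchemy 1.4+ and in-memory SQLite. The `Organization` model below is a hypothetical stand-in that deliberately lacks a unique constraint on `slug`, so duplicates can exist.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.exc import MultipleResultsFound
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Organization(Base):  # hypothetical stand-in, no unique constraint on slug
    __tablename__ = "organizations"
    id = Column(Integer, primary_key=True)
    slug = Column(String(50))


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([Organization(id=1, slug="default"),
                     Organization(id=2, slug="default")])
    session.commit()

    query = (session.query(Organization)
             .filter(Organization.slug == "default")
             .order_by(Organization.id))

    # first() silently picks one of the duplicates.
    first_id = query.first().id

    # one_or_none() surfaces the data problem instead of hiding it.
    try:
        query.one_or_none()
        duplicate_detected = False
    except MultipleResultsFound:
        duplicate_detected = True
```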
| name = Column(db.String(100)) | ||
| permissions = Column(postgresql.ARRAY(db.String(255)), | ||
| default=DEFAULT_PERMISSIONS) | ||
| created_at = Column(db.DateTime(True), default=db.func.now()) |
There was a problem hiding this comment.
Probably want server_default
There was a problem hiding this comment.
Probably, but I don't want to change the schema until after I get things working as-is. (Hi Jeff! It's been a while! Never expected to see someone from SFSH commenting on my code :-)
|
Fixes #1124 |
|
@washort I finished with my work on the frontend (for now) and want to give you a hand here. I rebased your branch with the latest master & fixed an issue with the settings/DATABASE_URL. Do you have some unpushed work or can I do a force push with these changes?
|
I updated the test code and now we get real failures, but many tests still fail just because the database runs out of connections. I tried to compare how we manage the connection/session in Calling engine#dispose (9f43542) seems to fix this, but is it the right usage? Why haven't I seen this in any other example?
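One possible explanation, sketched under the assumption that sessions are left open between tests: a session with an open transaction keeps a pooled connection checked out until it is closed. The example uses SQLAlchemy 1.4+ with in-memory SQLite and an explicit `QueuePool` so that the pool counters are available; it is not the redash test setup.

```python
from sqlalchemy import create_engine, text
from sqlalchemy.orm import Session
from sqlalchemy.pool import QueuePool

# A deliberately tiny pool, so leaks would show up quickly.
engine = create_engine("sqlite://", poolclass=QueuePool,
                       pool_size=2, max_overflow=0)

session = Session(engine)
session.execute(text("SELECT 1")).scalar()

# The session's open transaction still holds its connection.
held_while_open = engine.pool.checkedout()

# Closing the session ends the transaction and returns the connection.
session.close()
held_after_close = engine.pool.checkedout()

# dispose() closes the pooled connections themselves, which is a heavier hammer.
engine.dispose()
```

If the counter drops after closing the session, cleaning up sessions in the test teardown may be enough, with `dispose()` kept for tearing down the engine itself.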
|
Another SQLA question:
@classmethod
def get_by_id_and_org(cls, visualization_id, org):
    return cls.query.join(Query).filter(cls.id == visualization_id, Query.org == org).one()
With peewee I could pass to such method either Any middle ground?
|
No, you have to match up the right value with the right attribute. This is only really an issue in unit tests though, I think, because as far as I can tell, in the rest of the code you should only be using object ids when they're in query parameters; the rest of the time you can just pass around objects. |
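A sketch of the two spellings, assuming SQLAlchemy 1.4+ and in-memory SQLite. `SavedQuery` is a hypothetical stand-in for redash's `Query` model, and both helper functions are written for this example only. The relationship attribute is compared against an object, the foreign-key column against an id.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Organization(Base):  # simplified stand-in
    __tablename__ = "organizations"
    id = Column(Integer, primary_key=True)
    name = Column(String(50))


class SavedQuery(Base):  # hypothetical stand-in for redash's Query model
    __tablename__ = "queries"
    id = Column(Integer, primary_key=True)
    org_id = Column(Integer, ForeignKey("organizations.id"))
    org = relationship(Organization)


def get_by_id_and_org(session, query_id, org):
    # Relationship attribute on the left, ORM object on the right.
    return (session.query(SavedQuery)
            .filter(SavedQuery.id == query_id, SavedQuery.org == org)
            .one())


def get_by_id_and_org_id(session, query_id, org_id):
    # Foreign-key column on the left, plain id on the right.
    return (session.query(SavedQuery)
            .filter(SavedQuery.id == query_id, SavedQuery.org_id == org_id)
            .one())


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    org = Organization(id=1, name="default")
    session.add(SavedQuery(id=10, org=org))
    session.commit()

    by_object_id = get_by_id_and_org(session, 10, org).id
    by_plain_id = get_by_id_and_org_id(session, 10, org.id).id
```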
|
Working today on this branch was a reminder why I never liked SQLAlchemy in the first place :-\ It's very powerful, but why is the simple stuff so hard and verbose?
|
@washort But I had to force push the result over your branch... I hope you didn't have anything uncommitted. |
|
Looks like we're experiencing test failures due to webpack not running in CircleCI. |
I'm not 100% sure about this. How will it work? Any reference implementation/documentation? |
|
All tests pass now (I changed configuration to run Webpack) 💯 But I grepped the code for things like
|
That's why I fixed the tests first -- to see where it'd be profitable to write more tests :-) Re bridge tables - this is what I was looking at: http://docs.sqlalchemy.org/en/latest/_modules/examples/generic_associations/table_per_related.html |
This will work for |
| nullable=False) | ||
| class ChangeTrackingMixin(object): |
There was a problem hiding this comment.
How about implementing this with a before_update or after_update event?
It seems that SQLA has the tools to determine if something was changed in the event. I tried to experiment with it, but couldn't get the event to trigger. :\
There was a problem hiding this comment.
I found out why the event didn't trigger and implemented it: e8739b3
If you have no comments, I will push this change to your branch.
The part I'm not happy about is how we deduce the user who changed the object, but I'm not sure it's that bad. Eventually this code is only relevant for the API, which is Flask based... and I added some safeguards to make sure it doesn't cause harm outside of Flask context.
There was a problem hiding this comment.
Another issue with the way I set who changed the object is that it can't be changed :-\ Not a huge deal, mainly an issue in tests at the moment but feels wrong.
There was a problem hiding this comment.
I'll have a look at this next. What was the issue with the way it's done now?
There was a problem hiding this comment.
Mainly the fact that you need to call record_changes and the manual "calculation" of what changed.
But apparently doing it in the after_insert/after_update events is wrong (SQLA complains about using Session.add there) and using before_flush also introduces its own challenges.
If I don't find a solution for this today, I will revert to your version, apply record_changes where needed and revisit this in the future.
There was a problem hiding this comment.
Ah. Yeah I didn't want to try to get too magic at this point, might be interesting to investigate later.
There was a problem hiding this comment.
I always try to maintain balance between "magic" and "hassle" :-) At first it seemed like a good balance point here, but as this starts to become too complex, I think I will revert to the explicit version you had.
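For whoever revisits this later, a rough sketch of the before_flush variant, assuming SQLAlchemy 1.4+ and in-memory SQLite. The `Dashboard` model and the `recorded_changes` list are hypothetical. The listener only collects what changed and does not add Change rows to the session, which is where the real complications start.

```python
from sqlalchemy import Column, Integer, String, create_engine, event, inspect
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Dashboard(Base):  # hypothetical model, for illustration only
    __tablename__ = "dashboards"
    id = Column(Integer, primary_key=True)
    name = Column(String(100))


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
SessionFactory = sessionmaker(bind=engine)

recorded_changes = []  # stand-in for wherever changes would be persisted


@event.listens_for(SessionFactory, "before_flush")
def collect_changes(session, flush_context, instances):
    # session.dirty holds persistent objects with pending modifications;
    # newly added objects live in session.new and are skipped here.
    for obj in session.dirty:
        for attr in inspect(obj).attrs:
            history = attr.history
            if history.has_changes():
                recorded_changes.append(
                    (type(obj).__name__, attr.key, list(history.added)))


session = SessionFactory()
dashboard = Dashboard(name="Sales")
session.add(dashboard)
session.commit()            # INSERT: nothing recorded

dashboard.name = "Revenue"
session.commit()            # UPDATE: the name change is recorded
session.close()
```

The sketch leaves out who made the change, which is the part discussed above as awkward outside a Flask request context.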
Otherwise we were running out of connections.
|
@washort I did some updates to the CLI tests:
There are still some (4?) tests failing because the CLI creates its own app_context/db session. I'm not sure how to solve this :-( One option is to create our own |
Also moved old migrations to old_migrations folder (before deleting them entirely).
|
Added Alembic (with Flask-Migrate). Updated the tasks to reflect integration status. |
| from redash.utils.configuration import ConfigurationContainer | ||
| manager = click.Group(help="Data sources management commands.") | ||
| manager = AppGroup(help="Data sources management commands.") |
|
Aside from more fixes to broken functionality (if there is anything left) the only thing I want to add in this branch before merging is "Replace MeteredModel with SQLAlchemy timing events". All the rest can be a follow up IMO. |
|
Never mind, I was looking at |
|
Let's keep the functionality of overall timing per request and # of queries
executed. Until now this is the only information I actually used.
On Fri, 9 Dec 2016 at 20:52 Allen Short wrote:
Had a look at the query-timing stuff available in SQLAlchemy - the main
difficulty with replicating the current behavior is that by the time
queries are executed, the only information available is the query text and
its parameters; model class, method name etc. aren't accessible.
Any thoughts on how you want to handle this? Obviously we can parse the
SQL to retrieve table names and operation type, if that's the path you
prefer.
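A minimal sketch of the "overall time and number of queries" approach using SQLAlchemy's cursor-execute events, assuming SQLAlchemy 1.4+ and in-memory SQLite. The `stats` dictionary is a hypothetical stand-in; in a web app it would presumably live on a per-request object instead.

```python
import time

from sqlalchemy import create_engine, event, text

engine = create_engine("sqlite://")

# Hypothetical accumulator for the totals.
stats = {"queries": 0, "duration": 0.0}


@event.listens_for(engine, "before_cursor_execute")
def start_timer(conn, cursor, statement, parameters, context, executemany):
    # conn.info is a per-connection scratch dictionary.
    conn.info.setdefault("query_start", []).append(time.perf_counter())


@event.listens_for(engine, "after_cursor_execute")
def stop_timer(conn, cursor, statement, parameters, context, executemany):
    started = conn.info["query_start"].pop()
    stats["queries"] += 1
    stats["duration"] += time.perf_counter() - started


with engine.connect() as conn:
    conn.execute(text("SELECT 1"))
    conn.execute(text("SELECT 2"))

print(stats)
```

As noted above, only the statement text and parameters are available inside these hooks, so per-model breakdowns would need SQL parsing or some extra context passed along.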
|
|
In that case I think we're done. I still want to change the schema a bit but that can happen in a new branch. |
|
I've added some more metrics and .... it's merged! :) |
|
Major congrats you two, this was a lot of work! PS: @washort indeed surprised to see you on here. Hope you and fam are well. This is kinda like the inverse of when an IRL coworker told me he was googling something and found what he needed on SO, then realized I'd written the answer. @arikfr SFSH is a mailing list of a loosely affiliated group of folks, many (but not all) of whom attend gracepres.com.
is_draft one)
old_migrations (might just delete it)
alembic stamp head when creating the DB for the first time.
MeteredModel with SQLAlchemy timing events