-
Notifications
You must be signed in to change notification settings - Fork 3.6k
HHH-19559 add hibernate.multi_tenant.set_schema #10372
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
* | ||
* @since 7.1 | ||
*/ | ||
String SET_TENANT_SCHEMA = "hibernate.multi_tenant.set_schema"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't love the naming here at all; unfortunately the existing properties have irregular naming.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that was an unfortunate reality of a singular AvailableSettings with a zagillabillion settings. Related things weren't always "geographically" close to one another and naming tended to get disjointed. It was the main reason I started breaking them out.
But we can always "shadow deprecate" them - aka create new ones that replace the old while still recognizing the old. See DeprecationLogger#deprecatedSetting(java.lang.String, java.lang.String)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But we can always "shadow deprecate" them - aka create new ones that replace the old while still recognizing the old.
Yeah, that would be reasonable, I think.
|
||
try { | ||
session.afterObtainConnection( connection ); | ||
} | ||
catch (SQLException e) { | ||
try { | ||
releaseConnection( connection ); | ||
} | ||
catch (SQLException re) { | ||
e.addSuppressed( re ); | ||
} | ||
throw e; | ||
} | ||
return connection; | ||
} | ||
|
||
@Override | ||
public void releaseConnection(Connection connection) throws SQLException { | ||
try { | ||
session.beforeReleaseConnection( connection ); | ||
} | ||
catch (SQLException e) { | ||
log.warn( "Error before releasing JDBC connection", e ); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sebersole I'm anticipating that you might object to what I'm doing here. If so, I need a better suggestion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the idea is that a user would not need MultiTenantConnectionProvider right? They could just do the normal ConnectionProvider set up and, in conjunction with CurrentTenantIdentifierResolver, we'd just alter the schema/catalog used by the Session - right? Just want to make sure I'm understanding the intent here.
Assuming iiuc couldn't we just imply this? It would be better if there were an actual flag saying that the user wants schema/catalog tenancy. E.g.
hibernate.connection...
hibernate.tenancy.strategy=schema
hibernate.tenant_identifier_resolver=my.ResolverImpl
And that this would just be enough to infer what to do.
Have you looked at incorporating this into the LogicalConnection for the Session? Naively perhaps, that would seem like a better place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There used to be a way that users specified the type of tenancy, but I think that got dropped. In a way, this PR adds that distinction back.
Another option would be to infer this when there is a ConnnectionProvider (and NOT a MultiTenantConnectionProvider) with a configured CurrentTenantIdentifierResolver. What I mean is that this may actually be enough:
hibernate.connection...
hibernate.tenant_identifier_resolver=my.ResolverImpl
Trying to get away from the need for a setting here. Or is there some nuance I'm missing that makes you think we need it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the idea is that a user would not need MultiTenantConnectionProvider right?
Yes.
They could just do the normal
ConnectionProvider
set up and, in conjunction withCurrentTenantIdentifierResolver
, we'd just alter the schema/catalog used by the Session - right?
Yes.
It would be better if there were an actual flag saying that the user wants schema/catalog tenancy.
That's ... pretty much what I have implemented here, I believe. I just called the property something different.
Have you looked at incorporating this into the
LogicalConnection
for the Session? Naively perhaps, that would seem like a better place.
I have not. Will take a look at that.
OTOH, I think I would prefer this not be too deep down in the layers. It's a fairly high-level construct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OTOH, I think I would prefer this not be too deep down in the layers. It's a fairly high-level construct.
I'd argue that this is far deeper than LogicalConnection, which is one of the reasons I mentioned it. Plus LogicalConnection is the "eye of the needle" (in the good way) for Connection-related access for the Session.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's a defined tenant id,
You mean the resolver? Because I don't know that asking the resolver, if one, early is always feasible. Not saying it isn't; just saying I'm not sure it's not ;)
Ah, well, no, I meant even if there was one explicitly set on the session.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But given the 2 explicit options I'd defiitely prefer
hibernate.tenancy.style=schema
overhibernate.multi_tenant.set_schema=true
.
Wouldn't it be a reasonable use case to use both datasource-based and schema-based multitenancy?
E.g. we could have:
tenant1.1
=> datasource 1, schema1
tenant1.2
=> datasource 1, schema2
tenant2.1
=> datasource 2, schema1
tenant2.2
=> datasource 2, schema2
Thinking about it, we could even throw discriminator-based multitenancy into the mix, if some tenants share the same datasource + schema...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it be a reasonable use case to use both datasource-based and schema-based multitenancy?
Well this is already easily possible by just implementing your own MultiTenantConnectionProvider
.
For such complicated scenarios, I think it's totally reasonable that you have to write a bit of code.
The problem I'm trying to solve here is to have a way to do schema-based multitenancy with any regular-ol' ConnectionProvider
, without having to provide a MultiTenantConnectionProvider
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My point was that if you introduce hibernate.tenancy.style=schema
, you'll need to take into account that there's a custom
style or whatever you want to call it, that goes beyond just datasource
/schema
/discriminator
. Or that not setting a style is special case. Or that the datasource
style actually encompasses all others. Or...
I get you're trying to make simple things simpler, I'd be wary about making complex things more complex or more confusing, because that looks like a very real risk here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My point was that if you introduce hibernate.tenancy.style=schema, you'll need to take into account that there's a custom style or whatever you want to call it, that goes beyond just datasource/schema/discriminator
I think this is a good point, and goes to what I suggested with the 3 situations. Phrased in the terms here we have:
- column-based (
@TenantId
) - explicit
MultiTenantConnectionProvider
-based ConnectionProvider
+ "adjustment" of the Connection for the Session
I get you're trying to make simple things simpler, I'd be wary about making complex things more complex or more confusing, because that looks like a very real risk here.
I'm not following this... What is the complex case that you think is getting more complex here?
22c10f6
to
bdea4cf
Compare
hibernate-core/src/main/java/org/hibernate/internal/AbstractSharedSessionContract.java
Fixed
Show fixed
Hide fixed
/** | ||
* The name of the database schema for data belonging to the tenant with the | ||
* given identifier. | ||
* <p> | ||
* Called when {@value org.hibernate.cfg.MultiTenancySettings#SET_TENANT_SCHEMA} | ||
* is enabled. | ||
* | ||
* @param tenantIdentifier The tenant identifier | ||
* @return The name of the database schema belonging to that tenant | ||
* | ||
* @see org.hibernate.cfg.MultiTenancySettings#SET_TENANT_SCHEMA | ||
*/ | ||
default @NonNull String schemaName(@NonNull T tenantIdentifier) { | ||
return tenantIdentifier.toString(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels... out of place? I'm always a bit weirded out by "optionally relevant" methods like this.
Beyond feelings, when I think about it... My main argument in this case would be that the tenant identifier can be provided explicitly when opening a session, so I'm not sure a CurrentTenantIdentifierResolver
would necessarily always make sense in an application. In which case you'd have no way of leveraging this feature, except perhaps implementing all the other methods by throwing exceptions.
Can't we instead have an interface to turn an identifier name to an schema name, and ask people to have hibernate.multi_tenant.schema_resolver
(or whatever) to an implementation? We can totally have a simple
implementation that just takes the tenant identifier as the schema name.
That way, I think the "schema setting" part could be implemented similarly to discriminator-based multitenancy. Connections would be irrelevant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way... I get the feeling this could be a much larger concept. Like, an interface that resolves a tenant ID into a "datasource ID" (passed to connection resolvers) and schema name and discriminator (for discriminator-based multitenancy, so that discriminators in the DB aren't necessarily exactly the tenant ID).
Not suggesting to do this here, but it does feel... clean? Maybe I'm just being weird.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels... out of place? I'm always a bit weirded out by "optionally relevant" methods like this.
Well that's what I meant when I said it's not completely conceptually clean.
But it's sure as hell convenient for the user. If I'm a user of this stuff I would much prefer to have all multitenancy-related shit located in just one class.
Can't we instead have an interface to turn an identifier name to an schema name, and ask people to have
hibernate.multi_tenant.schema_resolver
(or whatever) to an implementation? We can totally have a simple implementation that just takes the tenant identifier as the schema name.
I mean, sure, of course we can.
But it's just such a god damn clusterfuck as far as usability goes. How many properties are you going to have to set, and how many interfaces are you having to implement, to get this one trivial feature working?
Like, an interface that resolves a tenant ID into a "datasource ID" (passed to connection resolvers) and schema name and discriminator (for discriminator-based multitenancy, so that discriminators in the DB aren't necessarily exactly the tenant ID).
I guess I think this would be jumping the shark.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like, an interface that resolves a tenant ID into a "datasource ID" (passed to connection resolvers) and schema name and discriminator (for discriminator-based multitenancy, so that discriminators in the DB aren't necessarily exactly the tenant ID).
I guess I think this would be jumping the shark.
It's not that different from naming strategies, but sure, it's complex.
Anyway, my point was: you want to merge two responsibilities in the same interface to minimize complexity, but:
- These particular two responsibilities can be used independently (one without the other), so this creates complexity for those needing only one. Which I guess is fine if we consider people generally need both.
- The derivation of a schema name from a tenant ID is itself a responsibility that could possibly have close siblings, such as deriving a discriminator from a tenant ID.
I would add that finding the current tenant ID generally has to do with the higher layers of your application (e.g. inspecting user connection info), while turning a tenant ID to a schema name is clearly about the lower layers (DBA stuff). So it's really a bit weird to mix those.
If I had to start from scratch with a single entry point for the whole thing, with some cohesion, I'd expose something like this:
interface MultiTenancyStrategy {
// Optionals could just be nullable. Whatever.
Optional<MultiTenantConnectionProvider> createConnectionProvider( .... );
Optional<String> tenantIdToSchemaName(String tenantId);
// Could be a type parameter instead of Object, but IMO if we don't check consistency on startup, that's hiding problems under the carpet.
Object tenantIdToDiscriminator(String tenantId);
String discriminatorToTenantId(Object tenantId);
}
EDIT: With default methods, obviously.
Then you can have a default "discriminator" strategy that just takes the tenant ID as a discriminator, an opt-in (but built-in) "schema" strategy that takes the tenant ID as schema name, and custom strategies from everything else. And the entry point is clear and exposes everything you need to know.
That being said... I'm certainly not blocking you from doing that. I just disagree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yrodiere Note that I'm not against having all these three things as (almost) completely independent features, in the sense that the user is allowed to turn each on and off independently. That's more-or-less what I've implemented. Though it's not completely realized because I didn't add support for setSchema()
to ContextualJdbcConnectionAccess
. Or rather: I added it then removed it.
But Steve on the other hand is pushing more in the direction of making these things mutually-exclusive, more like the model which existed implicitly on the old enum.
I'm fine either way; I just wouldn't add "can use them all together" as a requirement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But Steve on the other hand is pushing more in the direction of making these things mutually-exclusive, more like the model which existed implicitly on the old enum.
Okay, well I certainly don't agree with that. IMO that's an artificial limitation that will go against very real use cases. Having some tenants with a dedicated database, and others with a shared one but separate schema, is really not that much of a stretch.
So users with this use case will have to ignore the feature you're adding and do everything in the connection provider.
And frameworks like Quarkus, where the connection provider needs to be implemented by Quarkus itself due to integrations with e.g. Agroal, will be implementing everything all over again in their own code.
It would be a real waste of effort.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yoann disagrees with me?!? No way ;)
The mapping of identifier in the app to tenant-name in the database can actually be a shared contract. We could even apply it in the @TenantId
case. And can be as simple as a conversion:
interface TenantNameMapper {
/**
* Maps the name of the tenant used in the application (aka, CurrentTenantIdentifierResolver#resolveCurrentTenantIdentifier) to the corresponding database name which might be a schema name or
String toDatabaseName(Object identifier);
Object toIdentifier(String databaseName);
}
TenantNameMapper#toDatabaseName
allows taking, e.g., CurrentTenantIdentifierResolver#resolveCurrentTenantIdentifier
and mapping it to a value that depends a bit on the overall strategy:
- For column-based, this would be the value stored in the column
- For
MultiTenantConnectionProvider
, this would become the valiue passed to its#getConnection
method - Here it would be the name of the schema we switch to.
#toIdentifier
is obviously the inverse of this. And maybe not even needed depending
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I don't think I am "pushing more in the direction of making these things mutually-exclusive" so much as trying to recognize the bits that are reusable across the different situations and designing the SPI around that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yoann disagrees with me?!? No way ;)
What can I say, I'm a sucker for traditions! And we both like this one ;)
interface TenantNameMapper {
This makes a lot of sense to me, and is more or less in line with my initial proposal:
Can't we instead have an interface to turn an identifier name to an schema name, and ask people to have
hibernate.multi_tenant.schema_resolver
(or whatever) to an implementation? We can totally have asimple
implementation that just takes the tenant identifier as the schema name.
... but (to keep the tradition alive):
-
toIdentifier
is pointless in the schema case (I think?) -
Gavin doesn't like this. Quoting:
I mean, sure, of course we can.
But it's just such a god damn clusterfuck as far as usability goes. How many properties are you going to have to set, and how many interfaces are you having to implement, to get this one trivial feature working?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
toIdentifier is pointless in the schema case (I think?)
Yep, as I said:
#toIdentifier is obviously the inverse of this. And maybe not even needed depending
However, (1) it does not hurt and (2) I'd much prefer a single way to do that across all "strategies".
Gavin doesn't like this. Quoting:
In the normal case you don't even implement/provide this.
@@ -697,6 +699,32 @@ public JdbcConnectionAccess getJdbcConnectionAccess() { | |||
return jdbcConnectionAccess; | |||
} | |||
|
|||
private boolean useSchemaBasedMultiTenancy() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we are again :)
if ( !factoryOptions.isMultiTenancyEnabled() ) { | ||
// we might still be using schema-based multitenancy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry, what?
This might warrant a renaming of isMultiTenancyEnabled()
? Perhaps isMultiTenancyEnabledOrNotIDontKnowAskSomeoneElse()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yoann, that's the whole point of this.
The design that we inherited here was that "Mutitenancy" = you have a MultiTenantConnectionProvider
. And that was just wrong, and I guess we need to get away from it.
The problem is that Hibernate has support for all sorts of ConnectionProvider
s, and you completely lose all of that as soon as you need a MultiTenantConnectionProvider
. The only thing we sorta give you is support for the case of a JNDI-bound DataSource
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the old MultiTenancyStrategy
enum we used to have, SCHEMA
and DATABASE
were synonyms and both meant "use a custom MultiTenantConnectionProvider
". (And DISCRIMINATOR
was simply unimplemented.) Here I'm introducing a totally new thing that we've never had before: a way to do schema-based multitenancy without a MultiTenantConnectionProvider
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know. Sorry, perhaps not the best time for humor.
More seriously, renaming isMultiTenancyEnabled()
to isMultiTenantConnectionProviderEnabled
would probably be clearer than adding comments like this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More seriously, renaming
isMultiTenancyEnabled()
toisMultiTenantConnectionProviderEnabled
would probably be clearer than adding comments like this?
Sure, but that method's defined by an SPI so it would have to be done via the deprecation process. And note that isMultiTenancyEnabled()
is already clearly documented to be about the presence of a MultiTenantConnectionProvider
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The design that we inherited here was that "Mutitenancy" = you have a MultiTenantConnectionProvider. And that was just wrong, and I guess we need to get away from it.
I disagree with this. And you even disagreed with it above.
There are a few situations here -
- You have different "Connection pools" based on tenant - this is the purpose of MultiTenantConnectionProvider
- You use column-based tenancy
- This new hybrid you are implementing which uses plain-old ConnectionProvider and simply switches the schema/catalog of the Connection based on the tenant-id associated with the Session
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sebersole The specific method isMultiTenancyEnabled()
always meant that there was a MultiTenantConnectionProvider
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure where I said otherwise, but yep.
|
||
try { | ||
session.afterObtainConnection( connection ); | ||
} | ||
catch (SQLException e) { | ||
try { | ||
releaseConnection( connection ); | ||
} | ||
catch (SQLException re) { | ||
e.addSuppressed( re ); | ||
} | ||
throw e; | ||
} | ||
return connection; | ||
} | ||
|
||
@Override | ||
public void releaseConnection(Connection connection) throws SQLException { | ||
try { | ||
session.beforeReleaseConnection( connection ); | ||
} | ||
catch (SQLException e) { | ||
log.warn( "Error before releasing JDBC connection", e ); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But given the 2 explicit options I'd defiitely prefer
hibernate.tenancy.style=schema
overhibernate.multi_tenant.set_schema=true
.
Wouldn't it be a reasonable use case to use both datasource-based and schema-based multitenancy?
E.g. we could have:
tenant1.1
=> datasource 1, schema1
tenant1.2
=> datasource 1, schema2
tenant2.1
=> datasource 2, schema1
tenant2.2
=> datasource 2, schema2
Thinking about it, we could even throw discriminator-based multitenancy into the mix, if some tenants share the same datasource + schema...
This is an alternative solution to the problem of schema-based multi-tenancy.
It introduces:
hibernate.multi_tenant.set_schema
, andCurrentTenantIdentifierResolver.schemaName(tenantIdentifier)
.I need a review here @sebersole.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license
and can be relicensed under the terms of the LGPL v2.1 license in the future at the maintainers' discretion.
For more information on licensing, please check here.
https://hibernate.atlassian.net/browse/HHH-19559