Skip to content

feat: add option to enforce unique attribute #49

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

hbagdi
Copy link
Contributor

@hbagdi hbagdi commented Oct 28, 2018

Enforcing Uniqueness on an Index can go a long way since higher-level
application code doesn't need to add explicit checks for it.

To keep the change backwards compatible, changes have been made to
Insert() method to respect Unique constraints.

Insert will not allow update if EnforceUnique is set to true on the id
schema.
If any other schema of the table has EnforceUnique, updates using
Insert() will error out if the same value exists in the schema.

@jefferai
Copy link
Member

@vishalnayak almost certainly of interest to you

@vishalnayak
Copy link
Contributor

vishalnayak commented Oct 29, 2018

This is eliminating the need of higher level application code to check that the field value being inserted is not already being used in another object that is already inserted, which seems to be nice. However, this PR is also overloading another meaning to EnforceUnique, which is that a value can only be inserted once and that an update to the same field errors out even if the value being inserted is "unique". I am not sure if that is the right way to go.

@vishalnayak
Copy link
Contributor

I think that updates should be allowed as long as the value is not used elsewhere. Currently, "other" fields in the object cannot be updated because the uniqueness constraint on one field disallows updating the object altogether.

@hbagdi
Copy link
Contributor Author

hbagdi commented Oct 29, 2018

I think that updates should be allowed as long as the value is not used elsewhere. Currently, "other" fields in the object cannot be updated because the uniqueness constraint on one field disallows updating the object altogether.

@vishalnayak I expected this to come up and when I initially took a stab, updates seemed complicated to resolve.

I had introduced an Update() method in a previous incarnation of this commit, which took care of this case.

Giving a second look at the code, it seems that updates could be allowed and be a little more smarter, but might be a little expensive as well.

An unrelated question, not sure who can answer @jefferai or @vishalnayak,

Can I remove EnforceUnique, and overload the meaning of Unique? This could be a breaking change, hence the concern.

@vishalnayak
Copy link
Contributor

@hbagdi I do strongly feel that update is a very common use case for an object (at least Vault uses memdb in such a manner). I also wonder about the double iteration of tableSchema.Indexes and if it can be avoided. Definitely curious to know what would be the overhead.

Regarding merging the functionality of EnforceUnique within Unique, you are right in thinking that it would be a backwards incompatible change. It would only complain those users who have used the Unique incorrectly by inserting multiple objects without honoring the Unique setting. But, it can't be known how many projects are using this and in which ways. Being a person who has not maintained this repository before, I can't weigh in strongly. But I am against breaking the existing behavior.

txn.go Outdated
// On an update, there is an existing object with the given
// primary ID. We do the update by deleting the current object
// and inserting the new object.
if update && idSchema.Unique && idSchema.EnforceUnique {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not certain at this point if this PR will get merged or not. In any case, this check should certainly go away.

@hbagdi
Copy link
Contributor Author

hbagdi commented Nov 17, 2018

The problem here is if you allow updates using Insert() method, then it leads to a confusing API.

Currently, Insert() can insert or update an entity.

Also, EnforceUnique though not present in the package's master branch, is actually enforced on the idSchema, because duplicates in idSchema are updated/overwritten.

Now, if a user of this application would like to get an error if there is duplicate ID during an insert, then update operations cannot be supported using the same Insert().

I'd happy to add an Update() if that is an acceptable solution.

Let me know your thoughts!

@hbagdi
Copy link
Contributor Author

hbagdi commented Dec 3, 2018

Friendly ping on this to keep moving forward. Thanks!

@hbagdi
Copy link
Contributor Author

hbagdi commented Apr 18, 2019

@jefferai @vishalnayak Would you guys be interesting in moving this forward?

@jefferai
Copy link
Member

jefferai commented May 8, 2019

@hbagdi We are, but please address the issue Vishal left -- we should explicitly not allow EnforceUnique on the ID. That way we can know if it's an insert or an update via the current API.

This check can be done as part of the schema validation.

Also probably the error should be a defined value for an enforce uniqueness violation to help the application understand what the issue is.

@hbagdi
Copy link
Contributor Author

hbagdi commented May 13, 2019

@hbagdi We are, but please address the issue Vishal left -- we should explicitly not allow EnforceUnique on the ID. That way we can know if it's an insert or an update via the current API.

For my use-case, I specifically want to throw an error if when an Insert() is actually performing an Update and not an Insert. If we don't actually EnforceUnique on ID, then it is not possible to support this specific use-case. 😕
Anything we can do to solve both of our use-cases?

@jefferai
Copy link
Member

Can you describe your actual use-case then? Is it purely to have a knob to never allow updates at all? Because that probably makes more sense at a table level.

@hbagdi
Copy link
Contributor Author

hbagdi commented May 14, 2019

Can you describe your actual use-case then? Is it purely to have a knob to never allow updates at all? Because that probably makes more sense at a table level.

I'm using memdb to store a cache of SQL table and then performing CRUD operations on it.
The use case does need to update the table but update and insert have different semantics.
It is needed that memdb throws an error if we try to insert a duplicate entry into the table, which is currently treated as an update or rather upsert.

Is it possible to release a v1 of this package and then add these APIs in the v2 version of the library, such that we keep the backwards compat by default (and give users to change the behavior as an opt-in, with EnforceUnique on ID)?

@jefferai
Copy link
Member

The use case does need to update the table but update
It is needed that memdb throws an error if we try to insert a duplicate entry into the table, which is currently treated as an update or rather upsert.

But how do you propose to actually do this? Those seem fundamentally at odds with the current proposal. If you want to update you need to call Insert with an ID so that we know what to update. If EnforceUnique is true then this will error, meaning you can't ever actually update.

If there was an Update call coupled with a table option to disallow updates via insert, then that would probably do what you want. But it doesn't really make sense to disallow updates per index -- objects are put into tables, indices are just built on top of the objects.

@hbagdi
Copy link
Contributor Author

hbagdi commented May 14, 2019

Indeed. In a very early incarnation of this PR, I had introduced an Update() function which was the path to be taken for updates and the Insert() would fail if it was being used for Update().

But it doesn't really make sense to disallow updates per index -- objects are put into tables, indices are just built on top of the objects.

Correct. We will need to use the index to check for duplicates though.

@hbagdi
Copy link
Contributor Author

hbagdi commented Jul 10, 2019

Friendly ping.

@jefferai
Copy link
Member

Hi there! What do you need on our end? It sounded like we were in agreement on the way forward but there have been no updates to the code.

@hbagdi
Copy link
Contributor Author

hbagdi commented Sep 21, 2019

Apologies for dropping the balls on this one, it has been a busy year.

If there was an Update call coupled with a table option to disallow updates via insert, then that would probably do what you want. But it doesn't really make sense to disallow updates per index -- objects are put into tables, indices are just built on top of the objects.

Is this an acceptable solution to this library?

I think to solve this problem, it is necessary to distinguish between an Insert and Update, otherwise, it is not possible to detect if the caller intended an insert or an update using just a single Insert().

@hbagdi
Copy link
Contributor Author

hbagdi commented Nov 4, 2019

This has stalled and is out of date.
If needed, I'll open up again in future with a better implementation and reasoning for design.

@hbagdi hbagdi closed this Nov 4, 2019
@jefferai
Copy link
Member

jefferai commented Nov 8, 2019

Hi @hbagdi ,

Just to be clear, is it that you don't have time to get this across the finish line? I think we were just waiting on updates from you. It seems like we'd all agreed on the need for Update instead of changing Insert's behavior.

@jefferai
Copy link
Member

jefferai commented Nov 8, 2019

Reopening for tracking.

@jefferai jefferai reopened this Nov 8, 2019
@hbagdi
Copy link
Contributor Author

hbagdi commented Nov 8, 2019

Okay, backwards compatibility remains a challenge but I'll work on it.
I worked on the original Update method more than a year ago but I'm much more familiar with this code-base now. I'll push it during the holidays.

@jefferai
Copy link
Member

jefferai commented Nov 8, 2019

Is backwards compat a challenge with a separate Update method and non-default option to disallow updates via insert?

@hbagdi
Copy link
Contributor Author

hbagdi commented Nov 8, 2019

Is backwards compat a challenge with a separate Update method and non-default option to disallow updates via insert?

No, then that shouldn't be a problem. I now know what to do. I'll ping back if I run into another blocker.

Enforcing Uniqueness on an Index can go a long way since higher-level
application code doesn't need to add explicit checks for it.

To keep the change backwards compatible, changes have been made to
`Insert()` method to respect Unique constraints.

Insert will not allow update if `EnforceUniqueness` is set to
true on the id schema.
If any other schema of the table has EnforceUnique, updates using
`Insert()` will error out if the same value exists in the schema.

A new `Update()` has been added to allow for updates when `EnforceUniqueness`
is set.
@hbagdi hbagdi force-pushed the feat/enforce-unique branch from 45e8b03 to 44c2c01 Compare November 23, 2019 04:00
@hbagdi
Copy link
Contributor Author

hbagdi commented Nov 23, 2019

@jefferai I updated the PR with what we are talking above.
It is not merge-ready (needs more tests) but I can polish it up if this aligns with what you've in mind.

@@ -213,6 +216,15 @@ func (txn *Txn) Insert(table string, obj interface{}) error {
}
}

if indexSchema.EnforceUniqueness && indexSchema.Unique {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this redundant with the check above?

@@ -93,7 +93,9 @@ type IndexSchema struct {
// exist from a structure.
AllowMissing bool

Unique bool
Unique bool
EnforceUniqueness bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the naming of this is a bit confusing, like, why wouldn't the Unique val already require this? Maybe something like DisallowUpdatesViaInsert?

@hashicorp-cla
Copy link

hashicorp-cla commented Mar 12, 2022

CLA assistant check
All committers have signed the CLA.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants