Add ability to lint based on word boundaries #1461
Merged
11e52f3 to cacc888
JordonPhillips approved these changes on Oct 24, 2022
sugmanue reviewed on Oct 24, 2022
smithy-linters/src/main/java/software/amazon/smithy/linters/WordBoundaryMatcher.java
This commit introduces a new syntax for matching words with the ReservedWords linter. It is intended to be used with the upcoming sensitive words linter defined in #1364.
In addition to wildcard searches ("*" as a prefix, suffix, or both for a contains match), we now support matching based on word boundaries. This commit introduces the "terms" keyword for word boundary searches and adds dedicated abstractions for word boundary and wildcard matching.
For example, "access key id" will match "AccessKeyId", "access_key_id", "accessKeyID", "access_key_id100", and "AccessKeyIDValue". It will also match when all the words are concatenated together: "accesskeyid". However, it will not match "accesskey_id" because that name splits into only two words ("accesskey" and "id").
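The matching behavior described above can be illustrated with a minimal sketch. This is not the code from this PR: the class name `WordBoundarySketch`, both method names, and the exact boundary rules (separators, lower-to-upper transitions, the end of an acronym, and letter/digit transitions) are assumptions chosen only to reproduce the examples in the description.

```java
import java.util.Locale;

// Hypothetical sketch; not the WordBoundaryMatcher implementation from this PR.
final class WordBoundarySketch {

    // Splits an identifier into lowercase words separated by single spaces.
    static String splitWords(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (!Character.isLetterOrDigit(c)) {
                appendSpace(out); // separators such as '_' or '-' end a word
                continue;
            }
            if (i > 0 && isBoundary(text, i)) {
                appendSpace(out);
            }
            out.append(Character.toLowerCase(c));
        }
        return out.toString().trim();
    }

    private static void appendSpace(StringBuilder out) {
        if (out.length() > 0 && out.charAt(out.length() - 1) != ' ') {
            out.append(' ');
        }
    }

    // Assumed rules: a word starts at a letter/digit change, at a lower-to-upper
    // change, or at the last capital of an acronym that is followed by lowercase.
    private static boolean isBoundary(String text, int i) {
        char prev = text.charAt(i - 1);
        char cur = text.charAt(i);
        if (!Character.isLetterOrDigit(prev)) {
            return false; // the separator already produced a boundary
        }
        if (Character.isDigit(prev) != Character.isDigit(cur)) {
            return true;
        }
        if (Character.isLowerCase(prev) && Character.isUpperCase(cur)) {
            return true;
        }
        boolean nextIsLower = i + 1 < text.length()
                && Character.isLowerCase(text.charAt(i + 1));
        return Character.isUpperCase(prev) && Character.isUpperCase(cur) && nextIsLower;
    }

    // True if the terms appear as consecutive whole words, or as one
    // concatenated whole word, in the split form of the text.
    static boolean matches(String terms, String text) {
        String needle = terms.trim().toLowerCase(Locale.ENGLISH).replaceAll("\\s+", " ");
        String concatenated = needle.replace(" ", "");
        String haystack = " " + splitWords(text) + " ";
        return haystack.contains(" " + needle + " ")
                || haystack.contains(" " + concatenated + " ");
    }

    public static void main(String[] args) {
        String[] names = {"AccessKeyId", "access_key_id", "accessKeyID",
                "access_key_id100", "AccessKeyIDValue", "accesskeyid", "accesskey_id"};
        for (String name : names) {
            System.out.println(name + " -> " + matches("access key id", name));
        }
    }
}
```

Under these assumed rules, every name in `main` matches "access key id" except "accesskey_id", which splits into "accesskey id" and so contains neither the three separate words nor the fully concatenated form.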
cacc888 to b25767a
sugmanue reviewed on Oct 24, 2022
        return result.toString();
    }

    private static void addLowerCaseStringToBuilder(StringBuilder result, String str, int start, int count) {
Contributor
Small nit: I think we can simplify the callers, and the function a bit, if we take the endIndex instead of the count.
Member
Author
I was mimicking what java.lang.String's constructor does here:
public String(char value[], int offset, int count) {
It's internal to the class, so it's probably not worth changing, IMO.
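For readers weighing the two signature styles in this thread, here is a small side-by-side sketch. The method bodies and the names `appendLowerByCount` and `appendLowerByEnd` are assumptions made for illustration; the diff shows only the signature of `addLowerCaseStringToBuilder`, not its body.

```java
// Hypothetical comparison of the two parameter conventions; not code from this PR.
final class LowerCaseAppendSketch {

    // (start, count) style, mirroring String(char[] value, int offset, int count).
    static void appendLowerByCount(StringBuilder result, String str, int start, int count) {
        for (int i = start; i < start + count; i++) {
            result.append(Character.toLowerCase(str.charAt(i)));
        }
    }

    // (start, endIndex) style, mirroring String.substring(int beginIndex, int endIndex).
    static void appendLowerByEnd(StringBuilder result, String str, int start, int endIndex) {
        for (int i = start; i < endIndex; i++) {
            result.append(Character.toLowerCase(str.charAt(i)));
        }
    }

    public static void main(String[] args) {
        String text = "AccessKeyId";
        int start = 6; // index of 'K'
        int end = 9;   // index just past 'y'

        StringBuilder byCount = new StringBuilder();
        appendLowerByCount(byCount, text, start, end - start); // caller computes the length

        StringBuilder byEnd = new StringBuilder();
        appendLowerByEnd(byEnd, text, start, end); // caller passes the position it already has

        System.out.println(byCount + " " + byEnd);
    }
}
```

A caller that scans the string and already holds the current position can pass it directly in the endIndex style, whereas the count style needs a subtraction at each call site. Both conventions appear in `java.lang.String`, so either is defensible for a private helper.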