Skip to content

Auto-Capitalize First Word #880

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 27 commits into from
Apr 15, 2024

Conversation

emilyychenn
Copy link
Contributor

@emilyychenn emilyychenn commented Apr 4, 2024

Bug/issue #: rdar://122167705

Summary

When writing a doc comment, it feels intuitive to not capitalise the first word of its description because of the nature of the doc comment.

For example: adding a new parameter would be written as:
/// - Parameter testParam: this parameter is just a test parameter to show what this looks like in lowercase.
In this example, the sentence in doc comments is natural to write with a lowercase starting word, because the first word (Parameter) is already capitalized. See the screenshot below for what this looks like in the built documentation.

Screenshot 2024-04-04 at 18 13 29

This PR auto-capitalizes the first word of a new section or aside. Note that this auto-capitalization only occurs if the first word is all lowercase and contains only characters A-Z, or if the first word contains CharacterSet punctuation characters (e.g. a period, comma, hyphen, semi-colon, colon).

Dependencies

N/A

Testing

Add documentation via doc comments with anything after the colon (e.g. a parameter, aside, warning, important, todo, note, throws, returns) written in lowercase, then compile the documentation. The rendered documentation should now include an uppercased first character if the first word in the sentence is all lowercase and/or punctuation belonging to the CharacterSet punctuation characters.

Checklist

Make sure you check off the following items. If they cannot be completed, provide a reason.

  • Added tests
  • Ran the ./bin/test script and it succeeded
  • Updated documentation if necessary

@emilyychenn emilyychenn force-pushed the capitalize-first-word branch from b1a0651 to 9f2c6e9 Compare April 4, 2024 17:23
/// This auto-capitalization only occurs if the first word is all lowercase and contains only characters A-Z.
/// The first word can also contain punctuation (e.g. a period, comma, hyphen, semi-colon, colon).
var capitalizeFirstWord: String {
let firstWord = self.components(separatedBy: " ").first ?? ""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using CharacterSet.whitespacesAndNewlines would be more robust.

Copy link
Contributor

@daniel-grumberg daniel-grumberg Apr 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also what does this do if the entire string is just one word? or if there are leading spaces?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could also avoid some temporary allocations by using substring APIs. Something like:

let firstWord = prefix(while: { !$0.isWhitespace })
guard let firstLetter = firstWord.first,
      firstLetter.isLowercase,
      !firstWord.contains(where: \.isUppercase)
else {
    return String(self)
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adjusted this to use a combination of this and Daniel's suggestion below to use CharacterSet APIs! Let me know what you think about the new implementation @daniel-grumberg @d-ronnqvist

@daniel-grumberg
Copy link
Contributor

I would love to see some integration testing where we go from the relevant markup with a parameters section to RenderJson and see the feature work end to end.

Additionally, I think having a command line flag to turn this off would be good as well but I wonder what @d-ronnqvist thinks about this.

daniel-grumberg

This comment was marked as duplicate.

Copy link
Contributor

@patshaughnessy patshaughnessy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd refactor the core function capitalizeFirstWord to see if you can avoid extra string allocations in the normal case where the first word is already capitalized. This will be called for all sentences in all doc comments, so it might add up.

Copy link
Contributor

@patshaughnessy patshaughnessy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great - thanks a lot for refactoring capitalizeFirstWord to return self most of the time I think that might be a noticeable performance difference.

I just left some coding nitpicks you can clean those up before merging if you agree.

@emilyychenn emilyychenn force-pushed the capitalize-first-word branch from 1864076 to 1e059be Compare April 12, 2024 13:52
Copy link
Contributor

@daniel-grumberg daniel-grumberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@emilyychenn emilyychenn force-pushed the capitalize-first-word branch from 8c672a5 to 6b3479b Compare April 15, 2024 10:45
@emilyychenn
Copy link
Contributor Author

@swift-ci please test

@emilyychenn emilyychenn merged commit 30110b0 into swiftlang:main Apr 15, 2024
emilyychenn added a commit that referenced this pull request Apr 15, 2024
rdar://122167705

When writing a doc comment, it feels intuitive to not capitalise the first word of its description because of the nature of the doc comment.

For example: adding a new parameter would be written as:
`/// - Parameter testParam: this parameter is just a test parameter to show what this looks like in lowercase.`
In this example, the sentence in doc comments is natural to write with a lowercase starting word, because the first word (Parameter) is already capitalized. 

This PR auto-capitalizes the first word of a new section or aside. Note that this auto-capitalization only occurs if the first word is all lowercase and contains only characters A-Z, or if the first word contains CharacterSet punctuation characters (e.g. a period, comma, hyphen, semi-colon, colon). This capitalization is also locale specific.

---------

Co-authored-by: Pat Shaughnessy <[email protected]>
Comment on lines +22 to +27
guard let firstWordStartIndex = self.firstIndex(where: { !$0.isWhitespace && !$0.isNewline }) else { return self }
let firstWord = self[firstWordStartIndex...].prefix(while: { !$0.isWhitespace && !$0.isNewline})

guard firstWord.rangeOfCharacter(from: Self.charactersPreventingWordCapitalization) == nil else {
return self
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find the mix of Character and CharacterSet API here confusing. It would be better to stick to one.

This could be either something like

guard let firstWordStartIndex = self.firstIndex(where: { !$0.isWhitespace && !$0.isNewline }) else { return self }
let firstWord = self[firstWordStartIndex...].prefix(while: { !$0.isWhitespace && !$0.isNewline })
guard !firstWord.contains(where: { $0.isLowercase || $0.isPunctuation }) else {
    return self
}

or something like

guard let firstWordStartIndex = rangeOfCharacter(from: .whitespacesAndNewlines.inverted),
      let firstWordEndIndex = rangeOfCharacter(from: .whitespacesAndNewlines, range: firstWordStartIndex.lowerBound..<endIndex)
else { 
    return self
}
let firstWord = self[firstWordEndIndex.lowerBound..<firstWordEndIndex.lowerBound]

guard firstWord.rangeOfCharacter(from: Self.charactersPreventingWordCapitalization) == nil else {
    return self
}

@emilyychenn emilyychenn mentioned this pull request Apr 16, 2024
3 tasks
emilyychenn added a commit to emilyychenn/swift-docc that referenced this pull request Apr 16, 2024
rdar://122167705

When writing a doc comment, it feels intuitive to not capitalise the first word of its description because of the nature of the doc comment.

For example: adding a new parameter would be written as:
`/// - Parameter testParam: this parameter is just a test parameter to show what this looks like in lowercase.`
In this example, the sentence in doc comments is natural to write with a lowercase starting word, because the first word (Parameter) is already capitalized. 

This PR auto-capitalizes the first word of a new section or aside. Note that this auto-capitalization only occurs if the first word is all lowercase and contains only characters A-Z, or if the first word contains CharacterSet punctuation characters (e.g. a period, comma, hyphen, semi-colon, colon). This capitalization is also locale specific.

---------

Co-authored-by: Pat Shaughnessy <[email protected]>
emilyychenn added a commit that referenced this pull request Apr 19, 2024
* Auto-Capitalize First Word (#880)

rdar://122167705

When writing a doc comment, it feels intuitive to not capitalize the first word of its description because of the nature of the doc comment.

For example: adding a new parameter would be written as:
`/// - Parameter testParam: this parameter is just a test parameter to show what this looks like in lowercase.`
In this example, the sentence in doc comments is natural to write with a lowercase starting word, because the first word (Parameter) is already capitalized. 

This PR auto-capitalizes the first word of a new section or aside. Note that this auto-capitalization only occurs if the first word is all lowercase and contains only characters A-Z, or if the first word contains CharacterSet punctuation characters (e.g. a period, comma, hyphen, semi-colon, colon). This capitalization is also locale specific.

---------

* Fix to auto-capitalization PR (#888)

rdar://122167705

* Changed withFirstWordCapitalized from a computed property to a function.

* Remove unused protocol and extension, and extracted repeated capitalization logic into [RenderBlockContent] and [RenderInlineContent] extensions.

* Rename function from capitalizeFirstWord() to capitalizingFirstWord() since this is a non-mutating function.

---------

Co-authored-by: Pat Shaughnessy <[email protected]>
emilyychenn added a commit to emilyychenn/swift-docc that referenced this pull request Apr 30, 2024
rdar://122167705

When writing a doc comment, it feels intuitive to not capitalise the first word of its description because of the nature of the doc comment.

For example: adding a new parameter would be written as:
`/// - Parameter testParam: this parameter is just a test parameter to show what this looks like in lowercase.`
In this example, the sentence in doc comments is natural to write with a lowercase starting word, because the first word (Parameter) is already capitalized. 

This PR auto-capitalizes the first word of a new section or aside. Note that this auto-capitalization only occurs if the first word is all lowercase and contains only characters A-Z, or if the first word contains CharacterSet punctuation characters (e.g. a period, comma, hyphen, semi-colon, colon). This capitalization is also locale specific.

---------

Co-authored-by: Pat Shaughnessy <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants