Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: capitalize multi-word country and capital city names #703

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

hippietrail
Copy link
Contributor

@hippietrail hippietrail commented Feb 17, 2025

There's some subjectivity about what's regarded as a country.

Some do not fit in either proper_noun_capitalization.rs or matcher.rs

The former capitalizes every word, but some have one minor word that shouldn't be capitalized. I didn't check if that logic is already built in though.

The latter expects words separated by spaces only, but some use hyphens.

Place names that have an uncapitalized word and a hyphen therefor don't fit either. There are some similar cases.

We can use hyphen in the former, but periods don't seem to work for places with names like "St. Foo".

The latter is case sensitive so I put in 3 case variants of each: all lower, first word caps but not the other words, all words caps including minor words which shouldn't be.

For place names that have some other variant such as covering both space and hyphen or with and without accent, this adds up to a lot and it's easy to miss one. Plus any other combination of upper and lower case doesn't get flagged.

A linter to handle the most common of these would be better.

There's some subjectivity about what's regarded as a country.

Some do not fit in either `proper_noun_capitalization.rs` or `matcher.rs`

The former capitalizes every word, but some have one minor word that shouldn't be capitalized. I didn't check if that logic is already built in though.

The latter expects words separated by spaces only, but some use hyphens.

Place names that have an uncapitalized word *and* a hyphen therefor don't fit either. There are some similar cases.

We can use hyphen in the former, but periods don't seem to work for places with names like "St. Foo".

The latter is case sensitive so I put in 3 case variants of each: all lower, first word caps but not the other words, all words caps including minor words which shouldn't be.

For place names that have some other variant such as covering both space and hyphen or with and without accent, this adds up to a lot and it's easy to miss one. Plus any other combination of upper and lower case doesn't get flagged.

A linter to handle the most common of these would be better.
Copy link
Collaborator

@elijah-potter elijah-potter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I appreciate what you've done here. I would prefer that we keep as much logic as possible out of matcher.rs, if possible. It's close to being ready, but I have some nitpicks.

@elijah-potter
Copy link
Collaborator

In the big "overhaul" I just merged, there were some minor changes to the syntax for proper nouns like these. Take a look at the merge commit to see what it should look like moving forward.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants