WIP: Implement Vale #3065

ngken0995 · 2023-09-26T16:10:59Z

Fixes

Potentially Fixes #3566 by @dhruvkb

Description

Add pre-commit hook for vale, create .vale.ini to determine what style should be implemented. Codespell should ignore editorial style guide.

I believe there is a bug in my code or I'm still missing some information while running pre-commit because vale will state there is no files to check. vale.................................................(no files to check)Skipped. Once I run just lint, all the error will be populated about the documentation folder.

Example:

 documentation/frontend/guides/test.md
 3:48  error  Use 'it's' instead of 'it is'.  Microsoft.Contractions

Testing Instructions

Checklist

My pull request has a descriptive title (not a vague title likeUpdate index.md).
My pull request targets the default branch of the repository (main) or a parent feature branch.
My commit messages follow best practices.
My code follows the established code style of the repository.
[N/A] I added or updated tests for the changes I made (if applicable).
[N/A] I added or updated documentation (if applicable).
[N/A] I tried running the project locally and verified that there are no visible errors.
[N/A] I ran the DAG documentation generator (if applicable).

Developer Certificate of Origin

Developer Certificate of Origin
Version 1.1

Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
1 Letterman Drive
Suite D4700
San Francisco, CA, 94129

Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.


Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I
    have the right to submit it under the open source license
    indicated in the file; or

(b) The contribution is based upon previous work that, to the best
    of my knowledge, is covered under an appropriate open source
    license and I have the right under that license to submit that
    work with modifications, whether created in whole or in part
    by me, under the same open source license (unless I am
    permitted to submit under a different license), as indicated
    in the file; or

(c) The contribution was provided directly to me by some other
    person who certified (a), (b) or (c) and I have not modified
    it.

(d) I understand and agree that this project and the contribution
    are public and that a record of the contribution (including all
    personal information I submit with it, including my sign-off) is
    maintained indefinitely and may be redistributed consistent with
    this project or the open source license(s) involved.

dhruvkb

This works well! I checked out the branch and ran just lint.

✖ 74 errors, 258 warnings and 315 suggestions in 20 files.

That is a lot of violations, especially considering it's only applied to the documentation/ folder for now. I think the following two changes are needed so that the PR passes CI and can be merged.

Some of the most violated rules need to be disabled, if they're not very serious. For example, Microsoft.ComplexWords and Microsoft.We don't really apply in our case.
Violations of more serious rules like Microsoft.Accessibility should be addressed and resolved.

@ngken0995 would you like to work on these changes? If yes, that's great, thank you! If not, we can keep it open and make it the base branch for subsequent PRs that adddress these changes and we'll merge it once it passes CI.

In either case, I'll convert it to WIP/draft for now.

.pre-commit-config.yaml

.vale.ini

ngken0995 · 2023-10-04T20:18:09Z

@dhruvkb I'll work on it. I'll change some of the violation rule to suggestions and edit Microsoft.Accessibility statements.

ngken0995 · 2023-10-05T15:51:54Z

@dhruvkb I have changed all style with error level to suggestion level except Microsoft.Accessibility. Let me know which other style guide should be set to error level.

dhruvkb · 2023-10-05T16:06:15Z

That's a massive effort @ngken0995, thank you so much! I'll try to review this asap. Since it's a pretty big PR, it can take a while so feel free to work on something else in the meantime.

dhruvkb

The changes look good to me except for some very aggressive rules related to words like 'disabled' and 'normal', which may be acceptable as long as they're not referring to people.

@WordPress/openverse-maintainers should we consider adding those words to an allowlist? Or should we update our documentation to not use these words as all and opt for alternatives (like 'deactivated' and 'standard' as this PR does)?

dhruvkb · 2023-10-16T07:36:32Z

documentation/general/deployment.md

-Once the service has determined the new task to be healthy, it will redirect all
+Once the service has determined the new task to be active, it will redirect all


Here 'healthy' is the technical term, cannot change to 'active'. Maybe we can add an exception for this?

dhruvkb · 2023-10-16T07:40:49Z

styles/Microsoft/Accessibility.yml

+  - affected by
+  - an epileptic
+  - crippled
+  - disabled


Having this in the list adds a lot of changes when it refers to a toggle or a feature and not a person.

dhruvkb · 2023-10-16T07:42:26Z

styles/Microsoft/Accessibility.yml

+  - maimed
+  - missing a limb
+  - mute
+  - normal


Having this in the list is a bit excessive. Afaik 'normal' has no negative connotations in regular speech.

As with other things, my only perception of "normal" as a negative or derogatory thing is in reference to people (culture, ability, etc). The behaviour of an application is something I personally feel comfortable classifying as "normal" and "abnormal". However, that's my personal sense of it, I've asked another person and they also felt that so long as it wasn't in reference to a person it didn't seem bad, and avoiding it would complicate the language (with the adverse effect of accidentally making the written text less accessible, from a general linguistic perspective).

"Normal" also doesn't immediately evoke the same kinds of historical violence when not used to describe a person. Whereas other terms like "master/slave", "nuke", etc, directly and unambiguously evoke tremendous violence, regardless of the context. It makes perfect sense to avoid those, especially because there are actually clearer terms to use. Same with blacklist/whitelist (deny/allow is much clearer).

The rule message at the top of this file is:

message: "Don't use language (such as '%s') that defines people by their disability."

We have only a handful of documents that this would ever be relevant to (where we talk about people and their behaviour), namely the decision-making process and the code of conduct. I get the sense that Microsoft created this particular rule for non-technical documentation, like management or HR documents? There are probably some software domains this is relevant to, not sure Openverse fits into that domain. Maybe HR software, education software, hospital software, etc?

That's just my 2-cents. I don't mind changing the words we use, especially because "inactive" and "deactivated" are just as clear as "disabled" if used in the right contexts. "Standard" feels like less of an unambiguous alternative to "normal", especially because in software it evokes standards, but doesn't feel sufficiently different for me to object to changing it if someone feels strongly about it not being used ever. Though my sense of the disability activism space is that no one would care about using "normal" to talk about the behaviour of a computer program. I also have a hard time imagining that a derogatory/negative usage of the word would get past our code review process, considering the concern and care Openverse maintainers have historically had for a wide range of other issues.

So anyway, I agree that we could remove this rule altogether. Its description of itself shows that it's not really for software documentation. On the other hand, if someone feels strongly that we should add it, fine by me as well. Maybe we can remove it for now to minimise the changes in this PR and get input from other maintainers afterwards, and make appropriate adjustments as needed?

Oh, of course, the other option is to leave the rule, but remove the words you've noted are common in software without a negative connotation: normal, disabled, healthy, unhealthy. Can't hurt to have an automatic check for the other words here.

ngken0995 · 2023-10-18T18:36:08Z

@WordPress/openverse-maintainers Microsoft isn't the only style guide. Here is a link for more styles: https://vale.sh/types/style/

sarayourfriend · 2023-10-24T23:45:22Z

@ngken0995 Can we try alex and proselint? We have another issue to use alex here, so that would allow us to close that one here as well. It has, I think, much better automatic suggestions for the words that will make it easier to transition our prose. It does include the terms disabled, but has much broader suggested alternatives that, at least to me, make it easier to find phrases that are indeed clearer. Would that be okay? Thanks very much for your patience so far on this issue. We've been dealing with some complex internal issues with our production services alongside folks being away from directly working on Openverse here and there, and it's been hard to keep up with reviews for all of us recently.

ngken0995 · 2023-10-25T21:43:41Z

@ngken0995 Can we try alex and proselint? We have another issue to use alex here, so that would allow us to close that one here as well. It has, I think, much better automatic suggestions for the words that will make it easier to transition our prose. It does include the terms disabled, but has much broader suggested alternatives that, at least to me, make it easier to find phrases that are indeed clearer. Would that be okay? Thanks very much for your patience so far on this issue. We've been dealing with some complex internal issues with our production services alongside folks being away from directly working on Openverse here and there, and it's been hard to keep up with reviews for all of us recently.

@sarayourfriend I have removed Microsoft and write-good and add alex and proselint. I have edit all file with level: error to level: suggestion. Let me know which file should be set to error or warning then I can work on editing the documentations.

sarayourfriend · 2023-10-30T05:31:22Z

Thanks for the update. I'm reading through these lists to give feedback on what to remove, upgrade, etc, and I'm struck by two things.

First, the project has gone to great extents to avoid exposing contributors to many of these terms, especially violent slurs. We keep our own list of such terms for filtering indexed works that have potentially sensitive textual content in an entirely separate repository, with only a subset of maintainers asked to monitor and contribute to it. The goal has been to avoid keeping any list like this in our main repository, with the understanding that it's actually inaccessible to make it so easy to accidentally come across them. Some of these lists (rightly!) include really horrific terms, so we'd be going directly against previous diligence to directly include them.

Second, I started to think it would be nice to just switch to alex and prose-lint directly. However, it's clear that Vale is a much better tool, both in that it is much faster than either of those tools and in its deep customisability.

With that in mind, and with the interest of keeping going with Vale to supersede directly using those other linters, I'd like to suggest the following: rather than including our Vale configuration directly in the monorepo, let's move it to https://github.com/WordPress/openverse-sensitive-terms and then build a docker image, published to GHCR, that includes the configuration baked in.

The goal behind this suggestion is to be able to continue to use Vale to its greatest, customise extent, without breaking the previous agreement we made as a project to not include word lists like this directly in our main repository. Further, I think a lot of the term lists that we'll get from this could feed directly into the sensitive terms list we use for identifying potentially sensitive textual content in indexed works. In other words, these lists are actually directly relevant to the other work we do with sensitive terms lists, so it wouldn't be misplaced to put them there.

So @ngken0995, would you be interested in working on an approach based on this idea in the sensitive terms repository? I understand if not, especially given how sporadic feedback has been on this PR (apologies again for that, things have been difficult these past few weeks for the core maintainers!), just let me know either way. The way I see it working is:

Move the styles into a branch on the other repository
Create a Vale.dockerfile that extends https://hub.docker.com/r/jdkato/vale to bake in our styles and .vale.ini, creating an entrypoint script that automatically passes the --config option to the vale command.
Create a GitHub workflow to publish that image to GHCR under openverse-vale (for this you will probably need help from a maintainer, or a maintainer will need to do it themselves, due to permissions issues)
Add a new pre-commit hook in our .pre-commit-config.yaml using the docker_image language and the openverse-vale:latest image

@dhruvkb Can you give your thoughts whether that approach makes sense? I don't feel comfortable at all committing these lists to the monorepo, and this is one solution I can think of to avoid it. If either of y'all have alternative ideas, please say so 🙏

dhruvkb · 2023-10-30T06:43:06Z

I agree with @sarayourfriend's idea to utilise the sensitive terms repo for this and use a Docker image to run the linting on content in this repo. It's very well thought out. It is however a bit more complex now, so @ngken0995 please let us know if we can help out in any way e.g. drafting issues or providing suggestions to your drafts.

ngken0995 · 2023-10-30T17:13:30Z

@sarayourfriend Initially, I was using alex as one of the style because there was another PR which suggest using it. Then the really horrific terms showed up and it felt uncomfortable which I decided to switch to Microsoft style. Thank you for the explanation. The proposal looks amazing and intriguing to work.
@dhruvkb Thank you for the suggestion for drafting issues. I wouldn't know where to begin. Please ping me when the issues are created. I would love to work on them.

openverse-bot · 2023-11-01T00:00:14Z

Based on the low urgency of this PR, the following reviewers are being gently reminded to review this PR:

@AetherUnbound
@obulat
This reminder is being automatically generated due to the urgency configuration.

Excluding weekend¹ days, this PR was ready for review 17 day(s) ago. PRs labelled with low urgency are expected to be reviewed within 5 weekday(s)².

@ngken0995, if this PR is not ready for a review, please draft it to prevent reviewers from getting further unnecessary pings.

Specifically, Saturday and Sunday. ↩
For the purpose of these reminders we treat Monday - Friday as weekdays. Please note that the operation that generates these reminders runs at midnight UTC on Monday - Friday. This means that depending on your timezone, you may be pinged outside of the expected range. ↩

sarayourfriend · 2023-11-01T03:08:03Z

@ngken0995 Here is the tracking issue in the other repository: WordPress/openverse-sensitive-terms#6

I'll close this PR now in favour of completing the prerequisite work first. Thanks again for your patience and willingness to work through and think about this issue carefully. We really appreciate all your contributions and hard work!

ngken0995 requested a review from a team as a code owner September 26, 2023 16:10

ngken0995 requested review from AetherUnbound and obulat September 26, 2023 16:11

openverse-bot added 🟩 priority: low Low priority and doesn't need to be rushed ✨ goal: improvement Improvement to an existing user-facing feature 📄 aspect: text Concerns the textual material in the repository 🧱 stack: documentation Related to Sphinx documentation labels Sep 26, 2023

dhruvkb requested changes Oct 4, 2023

View reviewed changes

.pre-commit-config.yaml Outdated Show resolved Hide resolved

.vale.ini Outdated Show resolved Hide resolved

dhruvkb marked this pull request as draft October 4, 2023 05:45

Empty-Commit

d51fd64

ngken0995 force-pushed the implement-vale branch from 623c019 to d51fd64 Compare October 4, 2023 20:49

ngken0995 added 3 commits October 4, 2023 16:50

implement vale

a8bd47a

edit word for Microsoft.Accessibility

c983368

revise words for Microsoft.Accessibility

ab90fbf

ngken0995 requested a review from dhruvkb October 5, 2023 15:50

dhruvkb marked this pull request as ready for review October 6, 2023 11:23

dhruvkb reviewed Oct 16, 2023

View reviewed changes

remove microsoft and write-good and add alex and proselint

7b5656d

This was referenced Nov 1, 2023

Implement safe Vale styles WordPress/openverse-sensitive-terms#6

Open

Integrate Vale for linting docs according to editorial style guide #3566

Closed

sarayourfriend closed this Nov 1, 2023

sarayourfriend mentioned this pull request Nov 16, 2023

Integrate 'alex' to ensure our Markdown docs are considerate #368

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

WIP: Implement Vale #3065

WIP: Implement Vale #3065

ngken0995 commented Sep 26, 2023

dhruvkb left a comment

ngken0995 commented Oct 4, 2023

ngken0995 commented Oct 5, 2023

dhruvkb commented Oct 5, 2023

dhruvkb left a comment

dhruvkb Oct 16, 2023

dhruvkb Oct 16, 2023

dhruvkb Oct 16, 2023

sarayourfriend Oct 18, 2023

sarayourfriend Oct 18, 2023

ngken0995 commented Oct 18, 2023

sarayourfriend commented Oct 24, 2023 •

edited

Loading

ngken0995 commented Oct 25, 2023

sarayourfriend commented Oct 30, 2023

dhruvkb commented Oct 30, 2023

ngken0995 commented Oct 30, 2023 •

edited

Loading

openverse-bot commented Nov 1, 2023

sarayourfriend commented Nov 1, 2023

		Once the service has determined the new task to be healthy, it will redirect all
		Once the service has determined the new task to be active, it will redirect all

WIP: Implement Vale #3065

WIP: Implement Vale #3065

Conversation

ngken0995 commented Sep 26, 2023

Fixes

Description

Testing Instructions

Checklist

Developer Certificate of Origin

dhruvkb left a comment

Choose a reason for hiding this comment

ngken0995 commented Oct 4, 2023

ngken0995 commented Oct 5, 2023

dhruvkb commented Oct 5, 2023

dhruvkb left a comment

Choose a reason for hiding this comment

dhruvkb Oct 16, 2023

Choose a reason for hiding this comment

dhruvkb Oct 16, 2023

Choose a reason for hiding this comment

dhruvkb Oct 16, 2023

Choose a reason for hiding this comment

sarayourfriend Oct 18, 2023

Choose a reason for hiding this comment

sarayourfriend Oct 18, 2023

Choose a reason for hiding this comment

ngken0995 commented Oct 18, 2023

sarayourfriend commented Oct 24, 2023 • edited Loading

ngken0995 commented Oct 25, 2023

sarayourfriend commented Oct 30, 2023

dhruvkb commented Oct 30, 2023

ngken0995 commented Oct 30, 2023 • edited Loading

openverse-bot commented Nov 1, 2023

Footnotes

sarayourfriend commented Nov 1, 2023

sarayourfriend commented Oct 24, 2023 •

edited

Loading

ngken0995 commented Oct 30, 2023 •

edited

Loading