Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

added vale to CLI #2568

Open
wants to merge 26 commits into
base: main
Choose a base branch
from
Open

added vale to CLI #2568

wants to merge 26 commits into from

Conversation

sanika-n
Copy link
Contributor

@sanika-n sanika-n commented Dec 24, 2024

Summary

I added Vale, a prose linter to CLI as requested in "Add prose linter to CI #1572 " issue. So, each time a user pushes/ creates a PR, the CLI automatically runs, and its suggestions and warnings can be viewed. Currently, I have configured it to only check on .md files, but that can be changed according to use case by altering the " .vale.ini "file.

Screenshots

Currently after running on the existing repo, vale made a lot of recommendation and suggestions. I intentionally created a "bad" .md file titled TESTVALE.md just to test it, the results are attached below:
image
image
It gave some errors and warnings on the existing files as well. I have attached a log to that below.
4_Run Vale.txt

Closes #1572.

@EwoutH
Copy link
Member

EwoutH commented Dec 24, 2024

Thanks for the PR! I appreciate

It's currently over a thousands lines of code spread over 45 files, which is quite challenging to review. I also prefer not to add such amounts of code to our codebase. Are all the files critical for this functionality to function? Anything you might be able to cleanup?

Edit: Going quickly though the log, the warning Prefer 'personal digital assistant' over 'agent'. or (Agent) appears a lot. This is probably an LLM artifact.

Also, could Vale be integrated into our pre-commit config?

Edit 2: It seems Vale supports pre-commit!

.github/workflows/vale.yml Outdated Show resolved Hide resolved
@sanika-n
Copy link
Contributor Author

@EwoutH, I updated the version of vale and also suppressed vale to allow the use of "agent". I had actually cloned code from this
repo to -> https://github.com/errata-ai/Microsoft/tree/master/Microsoft to serve as the style for vale and hence the large amount of files, I tried to find a work around, but the only option seems to be to use it as a submodule...
I did integrate vale to pre-commit and it does fail as not all md files are following vale's rules.
image

However, the reason for failure in the checks above, (pre-commit.ci -pr) says that it failed because it wasn't able to find vale,
I am not able to figure out why that is happening
image

@EwoutH
Copy link
Member

EwoutH commented Dec 25, 2024

Maybe it’s easier to start with a minimal implementation, using only pre-commit, and then extend from there.

@sanika-n
Copy link
Contributor Author

@EwoutH, Hi, I think I figured it out it is working now:)
image

@quaquel
Copy link
Member

quaquel commented Dec 28, 2024

With 45 files changed, this PR has become impossible to review. I see a lot of value in adding this, however, would it be possible to have only this one change? Next, in follow up pr's we can go throught the code base to fix all issues that are flagged after vale has been activated.

@sanika-n
Copy link
Contributor Author

Sure, so from what I understand you would like me to cut down on the number of files right? I think one thing that I could is change the style from Microsoft to Write-Good (https://github.com/errata-ai/write-good/tree/master/write-good) this only has 9 yml files...Would that be fine?

@quaquel
Copy link
Member

quaquel commented Dec 28, 2024

My bad, I had not carefully checked the PR. It is not 45 files changed, but 1 file changed (pre-commit) and 44 new files required for vale. Can you briefly explain the options and why you went with the Microsoft style?

@sanika-n
Copy link
Contributor Author

Sure, so officially these are the styles that vale recognizes:
image
(https://github.com/errata-ai/packages)

I chose Microsoft because it has a wider range of error catching abilities and can focus on a lot more writing elements compared to some other options. Write-good is alright too but it is more basic and won't be able to catch errors that Microsoft can.

@quaquel
Copy link
Member

quaquel commented Dec 30, 2024

Finally had time to look closer at Vale. It seems they suggest to put the styles and .vale folders inside the .github folder. Can you ensure the layout follows their suggestions? Once that is done, I think this is good to go.

@sanika-n
Copy link
Contributor Author

Done 👍

@quaquel
Copy link
Member

quaquel commented Dec 31, 2024

I think this looks good, but it would be good if one of the other maintainers also takes a look.

@EwoutH
Copy link
Member

EwoutH commented Dec 31, 2024

Thanks for working on it.

When I review an PR one of the first things I do is judge the costs-benefit ratio of the PR. Each new feature adds complexity, is something that can break, and is something we need to update and maintain.

With this PR, I'm not yet fully convinced that ratio/proportionality is there.

  • Benefits: We already have Codespell for spelling/language, a code linter (ruff), a docstring linter (ruff-format) and a few smaller checkers for things like whitespaces. That seems to cover most stuff, I don't see the distinct value of Vale for us.
  • Costs: This PR adds a huge fileset that probably needs to be updated periodically. Can that be automated? A git submodule might be possible, but since it's the first, we need to update that.

So, while neither on their own are a problem, I currently don't see the ratio here. Either (personally) I need to be convinced of the benefits a prose linter adds, or the complexity needs to come way down (thing of one or at most two files that can be updated automatically periodically). Can we use the Microsoft or Google standard in some way without copying all their files to our repo?

@sanika-n
Copy link
Contributor Author

I completely understand, I did try to find alternatives to copying their repo, but I couldn't find much results... will surely update if I find a workaround

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add prose linter to CI
3 participants