add Codespell workflow and fix some typo #1568

Merged 6 commits on Nov 11, 2023

Contributor

@Fantu Fantu commented Nov 10, 2023

The codespell GitHub workflow helps to spot typos and shows them in the pull request.
I also fixed some typos, most of them spotted with codespell.

This helps by showing typo errors in the pull request.
It also shows typos outside the PR's changes.
It is set to only warn, so the check will not fail even if typos are present. It therefore does not force anyone to correct a typo that may not even have been introduced by that PR.
Most of the fixed typos were spotted with codespell.
Not all typos reported by codespell are fixed:
one is a name and one is a variable name,
one is in a regular expression and another is in a translated string (I suppose it is correct in that language).
@buhtz
Member

buhtz commented Nov 10, 2023

Dear Fantu,
thanks for your contribution.

Modifying infrastructure should be discussed with the maintenance team first. I am sorry, but we do not use workflows/actions or anything else that is exclusive to Microsoft GitHub. The goal is to prevent a lock-in effect. In the long run we plan to migrate to another code hoster.

But for my learning: What is a "workflow" technically? Is it a VM or a container?
And what application (a linter, for example) runs in that container to check the spelling? It seems to me that this tool checks more than just the py-files.
If you can recommend a good tool, we can integrate it into our current infrastructure (unit tests).

EDIT: Is it just https://github.com/codespell-project/codespell/ ?

EDIT2: Seems like a nice tool, easy to integrate into our unit tests. And it can be configured via a pyproject.toml file (which we will have after migrating the project structure). To my current knowledge, I see no need to use a closed-source, lock-in Microsoft product for this.
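
A minimal sketch of what running codespell from the unit tests could look like, assuming codespell is installed locally as a command-line tool. The class name `TestSpelling`, the helper `codespell_available` and the scanned path `.` are hypothetical and not taken from the project:

```python
import shutil
import subprocess
import unittest


def codespell_available() -> bool:
    """Return True if a `codespell` executable is found on PATH."""
    return shutil.which("codespell") is not None


class TestSpelling(unittest.TestCase):
    """Hypothetical test case; name and scanned path are assumptions."""

    @unittest.skipUnless(codespell_available(), "codespell is not installed")
    def test_no_typos(self):
        # codespell exits with a non-zero status when it reports misspellings.
        proc = subprocess.run(
            ["codespell", "."],
            capture_output=True,
            text=True,
            check=False,
        )
        # Show the codespell report as the failure message.
        self.assertEqual(proc.returncode, 0, msg=proc.stdout)
```

The test is skipped rather than failed when codespell is missing, so it would not block anyone who has not installed the tool.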

@buhtz buhtz marked this pull request as draft November 10, 2023 19:35
Member

@buhtz buhtz left a comment

The workflow should be deleted. Integrating codespell will be handled in a separate issue.
Except for one mistake (see other comment), the rest looks OK.

But please let's wait until the other PRs from today (#1562, #1567, #1560) are merged. I expect conflicts.

@Fantu
Contributor Author

Fantu commented Nov 10, 2023

As you prefer, if you want to abandon GitHub. However, it doesn't seem to me that Microsoft has done enough damage in these 5 years, or am I wrong?

@buhtz
Member

buhtz commented Nov 11, 2023

> it doesn't seem to me that Microsoft has done enough damage in these 5 years, or am I wrong?

To my knowledge there was no damage specific to BIT. But they disrespect our LICENCE by feeding Copilot with our repository.
Microsoft (MS) banned open source developers only because they are from "bad countries" (enemies), for example.
They use our work to feed their Copilot algorithm.
Besides that behavior, it is a matter of principle to me. GitHub is closed source. The company behind GitHub has proven more than once that it does not respect the privacy of its users. They have no respect for Free Software and the four freedoms, but use the Open Source label to "green wash" ("open wash" maybe?) their own image.
Their responsibilities are tied to the US government.

Technically, migrating BIT to another code hoster is currently no problem because we do not use GitHub-specific features. Even our TravisCI config might be easy to migrate if we did this. We should keep the relationship in its current "friends with benefits" state and not go further.

@Fantu
Contributor Author

Fantu commented Nov 11, 2023

Things labeled "artificial intelligence" are now being used everywhere. I think they largely bring negative aspects, but it now seems like an unstoppable trend, adopted even where it is counterproductive.
I suppose that even if it hadn't been acquired by Microsoft, GitHub would sooner or later have created a similar AI, whether for money or at customers' request.
Open source code, being open and reachable by anyone, is at risk of being "abused". It was before "AI" and it is even more so now (any repository, even a self-hosted one, is at risk as soon as it is indexed by even one search engine). Only if it were inaccessible would it be protected (but then it would not be open source), and even that protection is relative (until it is hacked).
As for privacy, unfortunately I have seen that trying to achieve it takes an enormous amount of effort and time with poor results (and it gets worse and worse :( ), so personally I no longer waste too much time and energy on it. In practice, if you want true privacy (or almost), you should go and live isolated and without information technology ^^'

@Fantu Fantu marked this pull request as ready for review November 11, 2023 10:02
@Fantu
Contributor Author

Fantu commented Nov 11, 2023

Removed the workflow and the wrong change.

> I will add codespell to my todo list and see if it makes sense to integrate it into our unittests (like we do with PyLint in #1562). It might give too many false positives.

There was a big list of false positives, but I had already excluded them in codespell: excluding the po files covers most of them, and a few ignored words cover the rest. In the codespell workflow it was:

skip: "*.po"
ignore_words_list: manuel,dum,sistem,clude

Something similar is available when using codespell as a locally installed command-line tool, and I suppose also when using it from pylint (but I have never used that and have not checked).
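
As a rough sketch, the two workflow settings above could be mapped to the command-line options `--skip` and `--ignore-words-list` like this. The helper name `codespell_args` is made up for illustration and is not part of codespell:

```python
def codespell_args(skip, ignore_words):
    """Build a codespell command line from the two settings shown above.

    Hypothetical helper: it only assembles the argument list and does not
    run codespell itself.
    """
    args = ["codespell"]
    if skip:
        # Glob patterns of files to skip, comma separated.
        args += ["--skip", ",".join(skip)]
    if ignore_words:
        # Words codespell should not report, comma separated.
        args += ["--ignore-words-list", ",".join(ignore_words)]
    return args


print(codespell_args(["*.po"], ["manuel", "dum", "sistem", "clude"]))
# ['codespell', '--skip', '*.po', '--ignore-words-list', 'manuel,dum,sistem,clude']
```

The resulting list can be passed to `subprocess.run` as it is, which avoids shell quoting problems with the `*.po` pattern.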

@buhtz buhtz merged commit 221e4d9 into bit-team:dev Nov 11, 2023
@aryoda
Contributor

aryoda commented Nov 11, 2023

@Fantu Thanks a lot for investing your time to help us save time!

@buhtz
Member

buhtz commented Nov 11, 2023

> ignore_words_list: manuel,dum,sistem,clude

Can you explain why manuel needs to be lower case even though the source string is upper case? Even after reading the documentation I don't get it. I opened an issue at codespell about it.

@Fantu
Contributor Author

Fantu commented Nov 11, 2023

Words to ignore must always be all lowercase. As far as I saw, it works only with lowercase entries (like in the dictionary), and the entry then also covers uppercase or mixed-case occurrences (so, for example, manuel in the ignore word list will skip Manuel, manuel or MANUEL).
From the man page:

-L WORDS, --ignore-words-list WORDS

    comma separated list of words to be ignored by codespell. Words are case sensitive based on how they are written in the dictionary file

And an example of a codespell dictionary:
https://github.com/codespell-project/codespell/blob/master/codespell_lib/data/dictionary.txt
I didn't understand it either at first. I understood only after seeing the dictionaries in the codespell source.
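
The behavior described above can be pictured with a small toy model. This is only an illustration and not codespell's actual implementation: it assumes the dictionary keys are lowercase, that an ignore entry removes the matching key, and that words from the source are looked up in lowercase. The entry `manuel -> manual` is an assumed example.

```python
# Toy model only: NOT codespell's actual implementation.
# Assumed dictionary entry, written in lowercase like the dictionary file.
DICTIONARY = {"manuel": "manual"}


def active_dictionary(ignore_words):
    """Drop every dictionary entry whose key is listed in ignore_words."""
    return {k: v for k, v in DICTIONARY.items() if k not in ignore_words}


def is_reported(word, ignore_words):
    """Look the word up in lowercase, so its casing in the source is irrelevant."""
    return word.lower() in active_dictionary(ignore_words)


for ignore in (["manuel"], ["Manuel"]):
    flagged = [w for w in ("Manuel", "manuel", "MANUEL") if is_reported(w, ignore)]
    print(ignore, "->", flagged)
# ['manuel'] -> []
# ['Manuel'] -> ['Manuel', 'manuel', 'MANUEL']
```

In this model a capitalized ignore entry never matches a dictionary key, which would explain why only the lowercase form has an effect.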
