Merge pull request #38 from dccuchile/develop
Version 0.4.0
pbadillatorrealba authored Sep 30, 2022
2 parents ec75b4f + 4cc3722 commit e3193ef
Showing 90 changed files with 6,990 additions and 4,503 deletions.
16 changes: 0 additions & 16 deletions .circleci/config.yml

This file was deleted.

37 changes: 37 additions & 0 deletions .github/workflows/ci.yaml
@@ -0,0 +1,37 @@
name: Tests
on:
  push:
    branches:
      - "master"
      - "develop"
  pull_request:
    branches:
      - "master"
      - "develop"
jobs:
  pytest:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: "3.7"
          cache: "pip"

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install flake8 pytest
          if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
          if [ -f requirements-dev.txt ]; then pip install -r requirements-dev.txt; fi
      - name: Lint with flake8
        run: |
          # stop the build if there are Python syntax errors or undefined names
          flake8 wefe --count --select=E9,F63,F7,F82 --show-source --statistics
          # exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide
          flake8 wefe --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics
      - name: Test with pytest
        run: |
          pytest tests
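
The lint and test steps of this workflow can be mirrored locally before pushing. The following is a minimal sketch and not part of the commit: the helper names `ci_commands` and `run_ci_locally` are hypothetical, and it assumes `flake8` and `pytest` are already installed in the active environment. It invokes both tools through `python -m`, which for pytest also puts the current directory on `sys.path`, so behavior may differ slightly from the bare `pytest tests` call in the workflow.

```python
import subprocess
import sys


def ci_commands(package="wefe", tests="tests"):
    """Return the workflow's lint and test commands as argv lists."""
    py = [sys.executable, "-m"]
    return [
        # Strict pass: fail on syntax errors or undefined names.
        py + ["flake8", package, "--count", "--select=E9,F63,F7,F82",
              "--show-source", "--statistics"],
        # Advisory pass: --exit-zero reports issues without failing.
        py + ["flake8", package, "--count", "--exit-zero",
              "--max-complexity=10", "--max-line-length=127", "--statistics"],
        # Test suite.
        py + ["pytest", tests],
    ]


def run_ci_locally():
    """Run each command in order and stop at the first failure."""
    for cmd in ci_commands():
        result = subprocess.run(cmd)
        if result.returncode != 0:
            return result.returncode
    return 0
```

Calling `run_ci_locally()` from the repository root returns `0` when every step succeeds and the failing step's exit code otherwise.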
32 changes: 18 additions & 14 deletions .gitignore
@@ -7,11 +7,12 @@ __pycache__/
*.so

# scikit-learn specific
doc/_build/
doc/auto_examples/
doc/modules/generated/
doc/datasets/generated/
doc/api/generated/
docs/_build/
docs/_build/*
docs/auto_examples/
docs/modules/generated/
docs/datasets/generated/
docs/api/generated/

# Distribution / packaging

@@ -62,8 +63,9 @@ coverage.xml
*.log

# Sphinx documentation
doc/_build/
doc/generated/
docs/_build/
docs/generated/
docs/results/

# PyBuilder
target/
@@ -74,19 +76,21 @@ target/
# jupyter
.ipynb_checkpoints/

.results/*
.results
# notebook execution results
results/*
results
docs/user_guide/gender_debiased_glove.kv

# mypy cache
.mypy_cache

./doc/results/

develop.ipynb

# conda deploy
conda-deploy/
conda_deploy/

*.csv
*.xls

doc/user_guide/gender_debiased_glove.kv
# coverage files
cov.xml
test-results/junit.xml
16 changes: 16 additions & 0 deletions .readthedocs.yaml
@@ -0,0 +1,16 @@
version: 2

formats:
  - epub
  - pdf

sphinx:
  configuration: docs/conf.py

python:
  version: "3.7"
  install:
    - requirements: requirements.txt
    - requirements: requirements-dev.txt
    - method: pip
      path: .
9 changes: 0 additions & 9 deletions .readthedocs.yml

This file was deleted.

40 changes: 17 additions & 23 deletions LICENSE
@@ -1,27 +1,21 @@
Copyright (c) 2016, Vighnesh Birodkar and scikit-learn-contrib contributors
All rights reserved.
MIT License

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
Copyright (c) 2022 WEFE Team

* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

* Neither the name of project-template nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
90 changes: 57 additions & 33 deletions README.rst
@@ -1,28 +1,24 @@
.. -*- mode: rst -*-
|ReadTheDocs|_ |CircleCI|_ |Conda|_ |CondaLatestRelease|_ |CondaVersion|_
|License|_ |GithubActions|_ |ReadTheDocs|_ |Downloads|_ |Pypy|_ |CondaVersion|_

.. |License| image:: https://img.shields.io/github/license/dccuchile/wefe
.. _License: https://github.com/dccuchile/wefe/blob/master/LICENSE

.. |ReadTheDocs| image:: https://readthedocs.org/projects/wefe/badge/?version=latest
.. _ReadTheDocs: https://wefe.readthedocs.io/en/latest/?badge=latest

.. |GithubActions| image:: https://github.com/dccuchile/wefe/actions/workflows/ci.yaml/badge.svg?branch=master
.. _GithubActions: https://github.com/dccuchile/wefe/actions

.. |CircleCI| image:: https://circleci.com/gh/dccuchile/wefe.svg?style=shield
.. _CircleCI: https://circleci.com/gh/dccuchile/wefe.svg?style=shield


.. |Conda| image:: https://anaconda.org/pbadilla/wefe/badges/installer/conda.svg
.. _Conda: https://anaconda.org/pbadilla/wefe/badges/installer/conda.svg


.. |CondaLatestRelease| image:: https://anaconda.org/pbadilla/wefe/badges/latest_release_date.svg
.. _CondaLatestRelease: https://anaconda.org/pbadilla/wefe/badges/latest_release_date.svg
.. |Downloads| image:: https://pepy.tech/badge/wefe
.. _Downloads: https://pepy.tech/project/wefe

.. |Pypy| image:: https://badge.fury.io/py/wefe.svg
.. _Pypy: https://pypi.org/project/wefe/

.. |CondaVersion| image:: https://anaconda.org/pbadilla/wefe/badges/version.svg
.. _CondaVersion: https://anaconda.org/pbadilla/wefe/badges/version.svg


.. _CondaVersion: https://anaconda.org/pbadilla/wefe


WEFE: The Word Embedding Fairness Evaluation Framework
@@ -133,38 +129,57 @@ To compile the documentation, run:
Changelog
=========

NEW DEVELOP VERSION
Version 0.4.0
-------------------
- 3 new bias mitigation methods (debias) implemented: Double Hard Debias, Half
Sibling Regression and Repulsion Attraction Neutralization.
- The library documentation has been restructured.
Now, the documentation is divided into user guide and theoretical framework.
The user guide does not contain theoretical information.
Instead, theoretical documentation can be found in the conceptual guides.
- Improved API documentation and examples. Added multilingual examples contributed
by the community.
- The user guides are fully executable because they are now on notebooks.
- There was also an important improvement in the API documentation and in metrics and
debias examples.
- Improved library testing mechanisms for metrics and debias methods.
- Fixed wrong repr of query. Now the sets are in the correct order.
- Greatly improved library testing mechanisms.
- Improved project documentation. Now, the documentation is divided into user guide and
theoretical framework. In addition, the user guides are fully executable because they
are now on notebooks.
- Implemented repr for WordEmbeddingModel.
- Testing CI moved from CircleCI to GithubActions.
- License changed to MIT.

Version 0.3.2
-------------
- Fixed RNSB bug where the classification labels were interchanged and could produce erroneous results when the attributes are of different sizes.
- Fixed RNSB bug where the classification labels were interchanged and could produce
erroneous results when the attributes are of different sizes.
- Fixed RNSB replication notebook
- Update of WEFE case study scores.
- Improved documentation examples for WEAT, RNSB, RIPA.
- Holdout parameter added to RNSB, which allows to indicate whether or not a holdout is performed when training the classifier.
- Holdout parameter added to RNSB, which allows to indicate whether or not a holdout
is performed when training the classifier.
- Improved printing of the RNSB evaluation.

Version 0.3.1
-------------
- Update WEFE original case study
- Hotfix: Several bug fixes for executing the original WEFE case study.
- fetch_eds top_n_race_occupations argument set to 10.
- Preprocessing: get_embeddings_from_set now returns a list with the lost preprocessed words instead of the original ones.
- Preprocessing: get_embeddings_from_set now returns a list with the lost
preprocessed words instead of the original ones.

Version 0.3.0
-------------
- Implemented Bolukbasi et al. 2016 Hard Debias.
- Implemented Thomas Manzini et al. 2019 Multiclass Hard Debias.
- Implemented a fetch function to retrieve gn-glove female-male word sets.
- Moved the transformation logic of words, sets and queries to embeddings to its own module: preprocessing
- Enhanced the preprocessor_args and secondary_preprocessor_args metric preprocessing parameters to a list of preprocessors `preprocessors` together with the parameter `strategy` indicating whether to consider all the transformed words (`'all'`) or only the first one encountered (`'first'`).
- Renamed WordEmbeddingModel attributes ```model``` and ```model_name``` to ```wv``` and ```name``` respectively.
- Moved the transformation logic of words, sets and queries to embeddings to its own
module: preprocessing
- Enhanced the preprocessor_args and secondary_preprocessor_args metric
preprocessing parameters to a list of preprocessors `preprocessors` together with
the parameter `strategy` indicating whether to consider all the transformed words
(`'all'`) or only the first one encountered (`'first'`).
- Renamed WordEmbeddingModel attributes ```model``` and ```model_name``` to
```wv``` and ```name``` respectively.
- Renamed every run_query ```word_embedding``` argument to ```model``` in every metric.
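
The difference between the `'first'` and `'all'` strategies mentioned in this list can be illustrated with a small stand-in. This is a simplified sketch and not WEFE's implementation: the function `resolve_word` is hypothetical, the vocabulary is a plain dictionary, and preprocessors are modeled as plain callables, which may differ from the format the library actually accepts.

```python
def resolve_word(word, vocab, preprocessors, strategy="first"):
    """Return the vocabulary keys that `word` maps to under `strategy`."""
    if strategy not in ("first", "all"):
        raise ValueError("strategy must be 'first' or 'all'")
    found = []
    for preprocess in preprocessors:
        candidate = preprocess(word)
        if candidate in vocab and candidate not in found:
            found.append(candidate)
            if strategy == "first":
                break  # keep only the first transformed word that is found
    return found


# Toy vocabulary with two casings of the same word (made-up vectors).
vocab = {"Paris": [0.1, 0.2], "paris": [0.3, 0.4]}
# Try the word unchanged first, then lowercased.
preprocessors = [lambda w: w, str.lower]

print(resolve_word("Paris", vocab, preprocessors, "first"))  # ['Paris']
print(resolve_word("Paris", vocab, preprocessors, "all"))    # ['Paris', 'paris']
print(resolve_word("PARIS", vocab, preprocessors, "first"))  # ['paris']
```

With `'first'`, the order of the preprocessors decides which form wins; with `'all'`, every distinct form found in the vocabulary is kept.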


@@ -179,21 +194,30 @@ Version 0.2.1

- Compatibility fixes.


Version 0.2.0
--------------

- Renamed optional ```run_query``` parameter ```warn_filtered_words``` to `warn_not_found_words`.
- Added ```word_preprocessor_args``` parameter to ```run_query``` that allows specifying transformations prior to searching for words in word embeddings.
- Added ```secondary_preprocessor_args``` parameter to ```run_query``` which allows specifying a second pre-processor transformation to words before searching them in word embeddings. It is not necessary to specify the first preprocessor to use this one.
- Implemented ```__getitem__``` function in ```WordEmbeddingModel```. This method allows obtaining an embedding from a word from the model stored in the instance using indexers.
- Renamed optional ```run_query``` parameter ```warn_filtered_words``` to
`warn_not_found_words`.
- Added ```word_preprocessor_args``` parameter to ```run_query``` that allows specifying
transformations prior to searching for words in word embeddings.
- Added ```secondary_preprocessor_args``` parameter to ```run_query``` which allows
specifying a second pre-processor transformation to words before searching them in
word embeddings. It is not necessary to specify the first preprocessor to use this
one.
- Implemented ```__getitem__``` function in ```WordEmbeddingModel```. This method
allows obtaining an embedding from a word from the model stored in the instance
using indexers.
- Removed underscore from class and instance variable names.
- Improved type and verification exception messages when creating objects and executing methods.
- Fix an error that appeared when calculating rankings with two columns of aggregations with the same name.
- Improved type and verification exception messages when creating objects and executing
methods.
- Fix an error that appeared when calculating rankings with two columns of aggregations
with the same name.
- Ranking correlations are now calculated using pandas ```corr``` method.
- Changed metric template, name and short_names to class variables.
- Implemented ```random_state``` in RNSB to allow replication of the experiments.
- run_query now returns as a result the default metric requested in the parameters and all calculated values that may be useful in the other variables of the dictionary.
- run_query now returns as a result the default metric requested in the parameters
and all calculated values that may be useful in the other variables of the dictionary.
- Fixed problem with api documentation: now it shows methods of the classes.
- Implemented p-value for WEAT
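
For readers unfamiliar with the WEAT p-value mentioned in the last item, the sketch below shows the exact permutation test as it is usually defined for WEAT. It is an illustration under stated assumptions, not WEFE's code: all function names are hypothetical, embeddings are plain lists of floats, every vector is assumed to be non-zero, and all equal-size re-partitions are enumerated, which is only practical for small target sets.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(w, A, B):
    """Mean similarity of w to attribute set A minus its mean similarity to B."""
    return (sum(cosine(w, a) for a in A) / len(A)
            - sum(cosine(w, b) for b in B) / len(B))


def weat_statistic(X, Y, A, B):
    """Summed association of target set X minus summed association of Y."""
    return (sum(association(x, A, B) for x in X)
            - sum(association(y, A, B) for y in Y))


def exact_p_value(X, Y, A, B):
    """One-sided p-value: share of equal-size re-partitions of X + Y whose
    statistic is strictly greater than the observed one."""
    observed = weat_statistic(X, Y, A, B)
    pooled = X + Y
    indices = range(len(pooled))
    total = exceed = 0
    for chosen in combinations(indices, len(X)):
        chosen_set = set(chosen)
        Xi = [pooled[i] for i in chosen]
        Yi = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        if weat_statistic(Xi, Yi, A, B) > observed:
            exceed += 1
    return exceed / total


# Toy 2-d embeddings: X leans toward A, Y leans toward B.
X = [[1.0, 0.0], [0.9, 0.1]]
Y = [[0.0, 1.0], [0.1, 0.9]]
A = [[1.0, 0.0]]
B = [[0.0, 1.0]]
print(exact_p_value(X, Y, A, B))  # 0.0
```

In this toy case the observed split is the most extreme of the six possible partitions, so no re-partition exceeds it. For realistic set sizes the number of partitions grows combinatorially, and a random sample of partitions is typically used instead.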
