Moving V1 example scripts to example/datasets folder (#369)
* Moving V1 example scripts to example/datasets folder
* Separate mypy pre-commit check for examples/datasets folder
Showing 7 changed files with 72 additions and 3 deletions.
@@ -0,0 +1,26 @@
# Example Integration: Question Answering

This example integration uses the [TruthfulQA (open-domain)](https://github.com/sylinrl/TruthfulQA) and the
[HaluEval (closed-domain)](https://github.com/RUCAIBox/HaluEval/tree/main/evaluation) datasets and OpenAI's GPT models
to demonstrate the question answering workflow in Kolena.

## Setup

This project uses [Poetry](https://python-poetry.org/) for packaging and Python dependency management. To get started,
install project dependencies from [`pyproject.toml`](./pyproject.toml) by running:

```shell
poetry update && poetry install
```

## Usage

The data for this example integration lives in the publicly accessible S3 bucket `s3://kolena-public-datasets`.

First, ensure that the `KOLENA_TOKEN` environment variable is populated in your environment. See our
[initialization documentation](https://docs.kolena.io/installing-kolena/#initialization) for details.
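
As a minimal sketch of the token check described above, a script could verify that `KOLENA_TOKEN` is populated before doing any work. The helper name `require_kolena_token` is hypothetical and is not part of the `kolena` package; only the environment variable name comes from this README.

```python
import os
from typing import Mapping, Optional


def require_kolena_token(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the KOLENA_TOKEN value, or raise a clear error if it is missing.

    Hypothetical helper for illustration only; not part of the kolena client.
    """
    # Fall back to the real process environment when no mapping is supplied.
    env = os.environ if env is None else env
    token = env.get("KOLENA_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "KOLENA_TOKEN is not set; see the initialization documentation"
        )
    return token
```

Calling `require_kolena_token()` at the top of a script surfaces a missing token immediately, before any dataset is read or registered.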

This project defines two scripts that perform the following operations:

1. [`register_dataset.py`](question_answering/register_dataset.py) registers both datasets by default. You can also
select the dataset to register by specifying `--datasets`.
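
The `--datasets` behavior described above (register everything by default, or only the named datasets) could be sketched with `argparse` as follows. This is not the contents of `register_dataset.py`, which this diff does not show: the accepted dataset identifiers, the function names, and the placeholder registration step are all assumptions made for illustration.

```python
import argparse
from typing import List, Optional

# Assumed identifiers; the values the real script accepts may differ.
DATASETS = ["TruthfulQA", "HaluEval"]


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(
        description="Register question answering datasets (illustrative sketch)"
    )
    parser.add_argument(
        "--datasets",
        nargs="+",
        choices=DATASETS,
        default=DATASETS,
        help="datasets to register; all are registered when the flag is omitted",
    )
    return parser.parse_args(argv)


def main(argv: Optional[List[str]] = None) -> List[str]:
    args = parse_args(argv)
    for name in args.datasets:
        # Placeholder for the actual registration step, which is not shown here.
        print(f"would register {name}")
    return list(args.datasets)
```

Under these assumptions, `main([])` walks both datasets, while `main(["--datasets", "HaluEval"])` walks only the one named.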
@@ -0,0 +1,20 @@
[tool.poetry]
name = "question_answering"
version = "0.1.0"
description = " Kolena Datasets Example integration for question answering"
authors = ["Kolena Engineering <eng@kolena.io>"]
license = "Apache-2.0"

[tool.poetry.dependencies]
python = ">=3.8,<3.11"
kolena = ">=0.99.0,<1"
s3fs = "^2022.7.1"

[tool.poetry.group.dev.dependencies]
pre-commit = "^2.17"
pytest = "^7"
pytest-depends = "^1.0.1"

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"
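
The `python = ">=3.8,<3.11"` constraint above means the project is expected to install only on Python 3.8, 3.9, or 3.10. As a rough sketch, the same range can be checked at runtime; the function name `python_supported` is hypothetical, and Poetry performs its own check at install time, so this is purely illustrative.

```python
import sys
from typing import Sequence


def python_supported(version: Sequence[int] = tuple(sys.version_info[:3])) -> bool:
    """Return True if `version` satisfies the ">=3.8,<3.11" constraint.

    Hypothetical helper mirroring the pyproject.toml range; only the
    major and minor components matter for these particular bounds.
    """
    major_minor = tuple(version[:2])
    return (3, 8) <= major_minor < (3, 11)


if not python_supported():
    print("warning: this example targets Python >=3.8,<3.11")
```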
13 changes: 13 additions & 0 deletions
examples/datasets/question_answering/question_answering/__init__.py
@@ -0,0 +1,13 @@
# Copyright 2021-2023 Kolena Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
File renamed without changes.