test: improve fixture definition by constraining dataset IDs #954

Merged
merged 8 commits into dev from il-better-fixtures
Dec 19, 2024

Conversation

ireneisdoomed
Contributor

@ireneisdoomed ireneisdoomed commented Dec 16, 2024

✨ Context

We use dbldatagen to build mock dataframes from a given schema. We use these mock dataframes to test that the logic returns data in the correct format. For content testing, we have to manually define what the data should look like.

In a recent PR, one test fails because the logic under test joins several mock datasets. The generated datasets share no ID values, so the join returns no rows and the test fails.

This PR aims to make the automatically generated datasets more useful by constraining the ID fields of each dataset so that joins between them return matching rows.
Once this PR is merged, the failing test in the above-mentioned PR will pass.
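
To illustrate why constraining the ID range helps, here is a minimal pure-Python sketch (no Spark involved). It assumes integer IDs and an arbitrary wide range of `10**12` for the unconstrained case; neither is taken from the actual fixtures, and `mock_ids` and `inner_join_keys` are hypothetical helpers.

```python
import random

N_ROWS = 400  # number of mock rows per dataset, as stated in this PR


def mock_ids(n_rows, low, high, seed):
    """Return n_rows pseudo-random integer IDs drawn from [low, high]."""
    rng = random.Random(seed)
    return [rng.randint(low, high) for _ in range(n_rows)]


def inner_join_keys(left_ids, right_ids):
    """Return the keys that would survive an inner join on the ID column."""
    return set(left_ids) & set(right_ids)


# Unconstrained (assumed range): IDs drawn from a huge space rarely coincide.
wide_left = mock_ids(N_ROWS, 1, 10**12, seed=1)
wide_right = mock_ids(N_ROWS, 1, 10**12, seed=2)

# Constrained: both sides draw from 1..N_ROWS, so many keys are shared.
narrow_left = mock_ids(N_ROWS, 1, N_ROWS, seed=1)
narrow_right = mock_ids(N_ROWS, 1, N_ROWS, seed=2)

print("shared keys, wide range:  ", len(inner_join_keys(wide_left, wide_right)))
print("shared keys, narrow range:", len(inner_join_keys(narrow_left, narrow_right)))
```

With both sides limited to the same 400 values, an inner join has plenty of keys to match on, which is the property the failing test needs.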

🛠 What does this PR implement

  • All main ID fields in each dataset will take values between 1 and 400 (the number of rows we randomly generate)
  • Other minor constraints, such as providing a list of all possible study types
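
As a rough sketch of what such constraints can look like with dbldatagen, assuming hypothetical column names (`studyId`, `studyType`) and a placeholder list of study types rather than the real fixture code in `tests/gentropy/conftest.py`:

```python
N_ROWS = 400  # number of mock rows per dataset, as stated in this PR

# Placeholder values for illustration only; not the project's actual list.
STUDY_TYPES = ["gwas", "eqtl"]


def id_constraint(n_rows):
    """Keyword arguments that bound an ID column to the range 1..n_rows."""
    return {"minValue": 1, "maxValue": n_rows}


def build_mock_study_index(spark):
    """Build a mock dataframe whose IDs are joinable with other fixtures.

    `spark` is an active SparkSession. dbldatagen is imported lazily so that
    this module can be imported without Spark installed.
    """
    import dbldatagen as dg

    return (
        dg.DataGenerator(
            sparkSession=spark, name="mock_study_index", rows=N_ROWS, partitions=1
        )
        # Hypothetical ID column: every fixture reuses the same bounds.
        .withColumn("studyId", "string", **id_constraint(N_ROWS))
        # Hypothetical categorical column: draw only from the allowed values.
        .withColumn("studyType", "string", values=STUDY_TYPES, random=True)
        .build()
    )
```

Sharing one `id_constraint` helper across fixtures keeps the ID ranges in sync, so any two mock datasets built this way overlap on their join keys.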

🙈 Missing

🚦 Before submitting

  • Do these changes cover one single feature (one change at a time)?
  • Did you read the contributor guideline?
  • Did you make sure to update the documentation with your changes?
  • Did you make sure there is no commented out code in this PR?
  • Did you follow conventional commits standards in PR title and commit messages?
  • Did you make sure the branch is up-to-date with the dev branch?
  • Did you write any new necessary tests?
  • Did you make sure the changes pass local tests (make test)?
  • Did you make sure the changes pass pre-commit rules (e.g. poetry run pre-commit run --all-files)?

@@ -67,6 +67,18 @@ class StudyIndex(Dataset):
A study index dataset captures all the metadata for all studies including GWAS and Molecular QTL.
"""

VALID_TYPES = [

Care to say that this should be an enum (in the future)
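
For reference, a minimal sketch of what an enum-based version might look like. The member names and values below are placeholders for illustration, not the contents of the actual `VALID_TYPES` list, and `StudyType` and `is_valid_study_type` are hypothetical names.

```python
from enum import Enum


class StudyType(str, Enum):
    """Hypothetical enum of study types; members are illustrative only."""

    GWAS = "gwas"
    EQTL = "eqtl"


# A plain list can still be derived for code that expects one.
VALID_TYPES = [study_type.value for study_type in StudyType]


def is_valid_study_type(value: str) -> bool:
    """Return True if value matches one of the enum's values."""
    return value in VALID_TYPES
```

An enum gives a single source of truth: lookups such as `StudyType("gwas")` raise `ValueError` on unknown values, while the derived list keeps existing callers working.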

tests/gentropy/conftest.py
@ireneisdoomed ireneisdoomed merged commit 5534908 into dev Dec 19, 2024
5 checks passed
@ireneisdoomed ireneisdoomed deleted the il-better-fixtures branch December 19, 2024 11:17