Examples of how to use hypothesis for data science attribute testing (e.g. pandas dataframes)

ronand97/hypothesis-examples

hypothesis-examples

Using hypothesis to generate custom, complex dataframes can be frustrating because the documentation is poor and lacks examples. Much of what I've done has come from trial and error and figuring things out manually.

Here, I will try to maintain some examples of simple and complex dataframe generation which can be used as a reference in the future.

Limitations

Unique strings

I have not figured out how to generate unique strings directly. As a workaround, I generate UUIDs and cast them to strings. For example, using pandas, this might look like:

import hypothesis.strategies as st
import hypothesis.extra.pandas as hpd
from hypothesis import given
import pandas as pd
from pandas.api.types import is_string_dtype


@given(
    hpd.dataframes(
        columns=[
            # unique=True keeps hypothesis from repeating a value within the column
            hpd.column("unique_strs", elements=st.uuids(), unique=True)
        ]
    )
)
def test_unique_cols(df: pd.DataFrame) -> None:
    # Cast the generated UUID objects to strings
    df["unique_strs"] = df["unique_strs"].astype(str)
    assert is_string_dtype(df["unique_strs"])
    # Every row should hold a distinct value
    assert len(df) == df["unique_strs"].nunique()
