Add Assertion on the dataset while creating MatchingData #22

Open
abhishek-ch opened this issue Jun 24, 2024 · 7 comments

Comments

@abhishek-ch
Collaborator

Add a few data quality checks inside MatchingData.
For example, if the population column cannot handle boolean-like values such as 0 and 1, MatchingData should catch that early.
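
To make the request concrete, here is a rough sketch of the kind of early check meant here. It is not part of pybalance: `check_population_labels` and its `expected_groups` parameter are hypothetical names, and the rules it enforces (no missing labels, one label type, exactly two populations) are assumptions about what MatchingData would want to assert. It takes any iterable of labels, so a pandas column could be passed in directly.

```python
from collections import Counter
import math


def check_population_labels(values, expected_groups=2):
    """Validate population labels before matching.

    Hypothetical helper for illustration only; not a pybalance API.
    Returns a dict mapping each label to its row count.
    """
    labels = list(values)
    if not labels:
        raise ValueError("population column is empty")

    # Missing labels (None or float NaN) cannot be assigned to a population.
    missing = sum(
        1 for v in labels
        if v is None or (isinstance(v, float) and math.isnan(v))
    )
    if missing:
        raise ValueError(f"population column has {missing} missing value(s)")

    # Mixed types such as 0 and "0" print alike but never compare equal,
    # so a lookup like get_population(0) could miss rows.
    type_names = sorted({type(v).__name__ for v in labels})
    if len(type_names) > 1:
        raise ValueError(f"population column mixes types: {type_names}")

    counts = Counter(labels)
    if len(counts) != expected_groups:
        raise ValueError(
            f"expected {expected_groups} populations, found {len(counts)}: "
            f"{sorted(counts, key=str)}"
        )
    return dict(counts)


if __name__ == "__main__":
    # Integer labels 0 and 1 pass and report the size of each population.
    print(check_population_labels([0, 1, 1, 0, 1]))
```

Failing with a clear `ValueError` at construction time would point at the column itself rather than at whichever matcher touches it first.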

@sprivite
Collaborator

Can you please show me the code that is causing trouble?

@abhishek-ch
Collaborator Author

This code raised an issue for me:

match = matcher.get_best_match()
m_data = m.copy().get_population(0)

This assumes the population column contains the values 0 and 1.

@sprivite
Collaborator

sprivite commented Jun 24, 2024

I cannot reproduce:

from pybalance.utils.balance_calculators import *
from pybalance.utils import MatchingData
from pybalance.sim import load_paper_dataset

# Load the example dataset and relabel the two populations as 0 and 1
m = load_paper_dataset()
data = m.data
data.loc[data.population == 'pool', 'population'] = 0
data.loc[data.population == 'target', 'population'] = 1

# Rebuild MatchingData from the relabelled frame and select population 0
m = MatchingData(data)
m.copy().get_population(0)

@sprivite
Collaborator

Can you please give the steps to reproduce?

@abhishek-ch
Collaborator Author

large_confounding_adjustment_dataset.csv
Here is the sample dataset.

@sprivite
Collaborator

What matcher are you using?

@sprivite
Collaborator

Can you please paste the full code along with the error?
