ml-fuzzy-matching

📂 Files

eda_cleaning.ipynb: Data loading, EDA, cleaning, and finalizing columns.
utils.py
data.py: Load and prepare training data (positive + negative pairs).
features.py: Extract similarity features from string pairs.
model.py: Train and evaluate logistic regression (LR) and XGBoost models, then save whichever performs better.
evaluate.py: CLI driver for the model comparison.
match.py: Inference: take a user query and return the best match.
config.py: Centralized constants (thresholds, paths, etc.).
demo.ipynb: Notebook that runs everything end-to-end and serves as a demo of how to work with this repo.
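
To make the roles of features.py and match.py more concrete, here is a minimal sketch of the idea: turn a pair of strings into similarity features, then pick the candidate that scores best. This is not the repo's actual code. The function names (`normalize`, `pair_features`, `best_match`), the three features, and the 0.5 threshold are all assumptions for illustration, and a plain average of the features stands in for the trained LR/XGBoost model.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so casing and spacing are ignored."""
    return " ".join(text.lower().split())


def pair_features(a: str, b: str) -> dict:
    """Hypothetical similarity features for one string pair, each in [0, 1]."""
    a, b = normalize(a), normalize(b)
    tokens_a, tokens_b = set(a.split()), set(b.split())
    union = tokens_a | tokens_b
    longest = max(len(a), len(b))
    return {
        # Character-level similarity from the standard library.
        "char_ratio": SequenceMatcher(None, a, b).ratio(),
        # Share of distinct tokens that the two strings have in common.
        "token_jaccard": len(tokens_a & tokens_b) / len(union) if union else 1.0,
        # Shorter length divided by longer length.
        "length_ratio": min(len(a), len(b)) / longest if longest else 1.0,
    }


def best_match(query: str, candidates: list, threshold: float = 0.5):
    """Return the best-scoring candidate, or None if nothing clears the threshold.

    The score is a plain average of the features. It is a placeholder for the
    probability a trained classifier would output (assumption).
    """
    if not candidates:
        return None
    scored = []
    for candidate in candidates:
        features = pair_features(query, candidate)
        scored.append((sum(features.values()) / len(features), candidate))
    score, candidate = max(scored)
    return candidate if score >= threshold else None


if __name__ == "__main__":
    catalog = ["Samsung Galaxy S21", "Apple iPhone 13 Pro", "Pixel 6"]
    print(best_match("iphone 13 pro", catalog))
```

In a trained setup, the averaged score would be replaced by the saved model's predicted match probability for each feature vector, with the threshold read from config.py.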

Correct Working Pipeline:
