Feature Selector Tool is a Python application designed for automatic feature selection in machine learning datasets.
It handles missing values, encodes categorical variables, analyzes feature importance, and offers powerful insights through visualizations, supporting both classification and regression tasks.
- Automatic Handling of Missing Values
- Automatic Detection and Encoding of Categorical Variables
- Insights into Feature Importance and Distribution
- Support for both CSV and TXT datasets
- Designed for classification and regression workflows
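The features above can be pictured with a minimal sketch. It is not the tool's actual implementation: the `rank_features` helper, the median/mode imputation, the integer category codes, and the random-forest ranking are all assumptions chosen for illustration.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor


def rank_features(df: pd.DataFrame, target: str, task: str = "classification") -> pd.Series:
    """Impute, encode, and rank features by random-forest importance (hypothetical helper)."""
    X = df.drop(columns=[target]).copy()
    y = df[target]

    # Fill missing values: median for numeric columns, most frequent value otherwise.
    # (Assumes no column is entirely empty.)
    for col in X.columns:
        if pd.api.types.is_numeric_dtype(X[col]):
            X[col] = X[col].fillna(X[col].median())
        else:
            X[col] = X[col].fillna(X[col].mode().iloc[0])

    # Treat every non-numeric column as categorical and replace it with integer codes.
    for col in X.select_dtypes(exclude="number").columns:
        X[col] = X[col].astype("category").cat.codes

    # Pick the estimator that matches the task type.
    model_cls = RandomForestClassifier if task == "classification" else RandomForestRegressor
    model = model_cls(n_estimators=100, random_state=0).fit(X, y)

    # Higher importance means the forest relied on the feature more.
    return pd.Series(model.feature_importances_, index=X.columns).sort_values(ascending=False)


# Tiny made-up dataset, for illustration only.
demo = pd.DataFrame({
    "age": [25, 32, None, 51, 46, 38, 29, 60],
    "city": ["Oslo", "Rome", "Oslo", None, "Rome", "Oslo", "Rome", "Oslo"],
    "label": [0, 0, 0, 1, 1, 0, 0, 1],
})
ranking = rank_features(demo, target="label")
print(ranking)
```

Passing `task="regression"` swaps in `RandomForestRegressor`, which mirrors the classification/regression split described in the feature list.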
The Feature Selector Tool includes powerful visualization methods to inspect dataset characteristics:
1. Navigate to the project directory: cd feature_selection_tool
2. Run the tool: python main.py
3. Choose a dataset using the file explorer when prompted.
4. Follow the on-screen instructions to perform feature selection.
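For readers curious how a CSV or TXT file chosen in step 3 might be read, here is a small sketch. The `load_table` function is hypothetical and may differ from what `main.py` does; it relies on pandas' delimiter sniffing (`sep=None` with the Python engine).

```python
import os
import tempfile

import pandas as pd


def load_table(path: str) -> pd.DataFrame:
    """Read a delimited text file (.csv or .txt) into a DataFrame (hypothetical helper)."""
    # sep=None asks pandas to detect the delimiter; this requires engine="python".
    return pd.read_csv(path, sep=None, engine="python")


# Write a small tab-separated .txt file and load it back.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.txt")
    with open(sample_path, "w") as fh:
        fh.write("age\tcity\tlabel\n25\tOslo\t0\n51\tRome\t1\n")
    sample = load_table(sample_path)

print(sample.columns.tolist())
```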
- Python >= 3.9 (required by the pinned pandas, numpy, and matplotlib versions)
- pandas == 2.1.2
- numpy == 1.26.1
- scikit-learn == 1.3.2
- feature-engine == 1.6.2
- torch == 2.1.0
- matplotlib == 3.8.1
- seaborn == 0.13.0
Install all requirements via:
pip install -r requirements.txt
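After installing, the environment can be checked against the pinned versions. The sketch below uses only the standard library; the `PINNED` mapping copies the list above, and `find_mismatches` is a hypothetical helper, not part of the tool.

```python
from importlib import metadata

# Versions copied from the requirements list above.
PINNED = {
    "pandas": "2.1.2",
    "numpy": "1.26.1",
    "scikit-learn": "1.3.2",
    "feature-engine": "1.6.2",
    "torch": "2.1.0",
    "matplotlib": "3.8.1",
    "seaborn": "0.13.0",
}


def find_mismatches(pinned: dict) -> dict:
    """Return {package: installed_version_or_None} for every package that differs from its pin."""
    mismatches = {}
    for name, wanted in pinned.items():
        try:
            # Drop any local build suffix such as "+cpu" before comparing.
            installed = metadata.version(name).split("+")[0]
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = installed
    return mismatches


mismatches = find_mismatches(PINNED)
print(mismatches or "All pinned versions match.")
```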
- Large Datasets: Large datasets may cause memory or computation issues. Consider subsampling them first.
- Outliers: The tool does not handle outliers itself. Handle them during preprocessing before running it.
- Dataset Characteristics: Some datasets may require extra preprocessing depending on their complexity.
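One way to prepare data before running the tool is sketched below. Both helpers are hypothetical, and the 1.5 x IQR clipping rule is a common convention chosen here as an assumption, not something the tool prescribes.

```python
import pandas as pd


def subsample(df: pd.DataFrame, max_rows: int, seed: int = 0) -> pd.DataFrame:
    """Return at most max_rows randomly chosen rows."""
    if len(df) <= max_rows:
        return df
    return df.sample(n=max_rows, random_state=seed)


def clip_outliers_iqr(df: pd.DataFrame, k: float = 1.5) -> pd.DataFrame:
    """Clip numeric columns to [Q1 - k*IQR, Q3 + k*IQR]; other columns are left untouched."""
    out = df.copy()
    numeric = out.select_dtypes(include="number").columns
    q1 = out[numeric].quantile(0.25)
    q3 = out[numeric].quantile(0.75)
    iqr = q3 - q1
    # axis=1 aligns the per-column bounds with the DataFrame's columns.
    out[numeric] = out[numeric].clip(lower=q1 - k * iqr, upper=q3 + k * iqr, axis=1)
    return out


# Made-up data: one numeric column with an obvious outlier, one text column.
demo = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0, 100.0], "name": list("abcde")})
small = subsample(demo, max_rows=3)
clipped = clip_outliers_iqr(demo)
print(clipped)
```

Clipping keeps every row, which suits small datasets; dropping the offending rows instead is an equally reasonable choice when data is plentiful.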
- Enhanced support for large datasets
- Improved outlier detection and handling
- Expanded visualization capabilities
- Only handles tabular datasets
| Dataset | Description | Link |
|---|---|---|
| Heart Disease Dataset | Features related to patient heart health. | Dataset Link |
| Diabetes Dataset | Health indicators for diabetes prediction. | Dataset Link |
| MovieLens Dataset | Movie recommendation system data. | Dataset Link |
Special thanks to all contributors and users who have helped test and improve the Feature Selector Tool!


