This repository has been archived by the owner on May 6, 2023. It is now read-only.
Releases: AFAgarap/pt-datasets
Add random oversampling
- Add an option to use simple random oversampling instead of SMOTE alone.
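For readers unfamiliar with the technique, random oversampling can be sketched as follows. This is a minimal illustration using only the standard library, not the code shipped in this release; the function name `random_oversample` and its parameters are hypothetical.

```python
import random
from collections import Counter


def random_oversample(features, labels, seed=0):
    """Duplicate randomly chosen samples until every class matches the largest one.

    Hypothetical helper for illustration; not part of the pt-datasets API.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_features, out_labels = list(features), list(labels)
    for cls, count in counts.items():
        # Indices of the original samples that belong to this class.
        indices = [i for i, y in enumerate(labels) if y == cls]
        # Draw with replacement to fill the gap to the majority class size.
        for i in rng.choices(indices, k=target - count):
            out_features.append(features[i])
            out_labels.append(cls)
    return out_features, out_labels


features = [[0.1], [0.2], [0.3], [0.4], [0.9]]
labels = [0, 0, 0, 0, 1]
new_features, new_labels = random_oversample(features, labels)
```

In this toy example both classes end up with four samples. Unlike SMOTE, no new points are synthesized; the minority samples are only repeated.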
Rename datasets directory
Use the `datasets` directory instead of `torch_datasets`.
Fix transformation pipeline
Fix the transformation pipeline for the following datasets:
- SVHN
- KMNIST
- COVID19 (binary and non-binary)
Fix augmentation pipeline
- Rearrange the augmentation transform pipeline.
Fix data augmentation
- Augment the training set of *MNIST datasets only.
Add preprocessing batch size for COVID19 dataset
- When loading the COVID19 datasets, you can now specify the batch size used during preprocessing. This avoids exhausting memory with huge tensors.
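The idea behind a preprocessing batch size can be sketched as below: only one batch-sized temporary exists at a time, instead of a temporary as large as the whole dataset. This is an illustration under assumptions, not the library's implementation; the function name `preprocess_in_batches` is hypothetical, NumPy stands in for tensors, and scaling to [0, 1] stands in for whatever preprocessing is actually applied.

```python
import numpy as np


def preprocess_in_batches(images, batch_size=64):
    """Scale uint8 images to float32 in [0, 1], one batch at a time.

    Hypothetical helper; the scaling step is a stand-in for real preprocessing.
    """
    # Allocate the output once; intermediate temporaries are batch-sized only.
    out = np.empty(images.shape, dtype=np.float32)
    for start in range(0, len(images), batch_size):
        stop = start + batch_size
        out[start:stop] = images[start:stop].astype(np.float32) / 255.0
    return out


# Ten random 8x8 RGB images, processed four at a time.
images = np.random.randint(0, 256, size=(10, 8, 8, 3), dtype=np.uint8)
processed = preprocess_in_batches(images, batch_size=4)
```

A smaller batch size lowers peak memory use at the cost of more loop iterations.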
Add oversampling function
- Oversample the minority class using SMOTE.
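For context, the core of SMOTE is interpolation between a minority sample and one of its nearest minority-class neighbors. The sketch below illustrates only that step in NumPy; it is not the function added in this release, and the name `smote_sketch` and its parameters are hypothetical. It assumes at least two minority samples.

```python
import numpy as np


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbors.

    Hypothetical helper for illustration; not part of the pt-datasets API.
    """
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=np.float32)
    k = min(k, len(minority) - 1)
    # Pairwise Euclidean distances within the minority class.
    dists = np.linalg.norm(minority[:, None, :] - minority[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbor
    neighbors = np.argsort(dists, axis=1)[:, :k]
    # Pick a base point and one of its k nearest neighbors per synthetic sample.
    base = rng.integers(0, len(minority), size=n_new)
    picked = neighbors[base, rng.integers(0, k, size=n_new)]
    # Place each new point at a random position on the segment between the two.
    gap = rng.random((n_new, 1), dtype=np.float32)
    return minority[base] + gap * (minority[picked] - minority[base])


minority = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
synthetic = smote_sketch(minority, n_new=6, k=2)
```

Because every synthetic point lies on a segment between two real minority samples, the new points stay inside the region the minority class already occupies.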
Load preprocessed COVID19 dataset
Features
- Preprocess dataset if it does not exist yet.
- Load the preprocessed dataset using the existing COVID19 dataset classes.
- Specify image size for preprocessing.
Resolve dependency issues
- Fix setup issues with `tsnecuda`.
- Add `cmake` and `opencv-python` to the list of dependencies.
Add support for preprocessed COVID19 datasets
Features
- Normalize WDBC features.
- Add preprocessor module for COVID19 datasets, which can be used for resizing the dataset images and exporting them together with the labels to a `.pt` file.
- Add class for preprocessed COVID19 datasets.
Bug fixes
- Convert the WDBC features data type to `float32`.
- Pack the test features and test labels for WDBC. Previously, a tuple of test labels was packed instead.
- Convert the COVID19 dataset labels to `int64`.
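The normalization and data type changes in this release can be pictured with the sketch below. It assumes column-wise standardization, which the release notes do not confirm, and the function name `prepare_tabular` is hypothetical. The target types reflect PyTorch conventions: layers use `float32` parameters by default, and class-index targets for losses such as cross-entropy are expected as `int64`.

```python
import numpy as np


def prepare_tabular(features, labels):
    """Standardize features column-wise and cast to float32 / int64.

    Hypothetical helper; the normalization scheme is an assumption.
    """
    features = np.asarray(features, dtype=np.float32)
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    features = (features - mean) / std
    labels = np.asarray(labels, dtype=np.int64)
    return features, labels


# Toy data: the second column is constant and is left at zero after centering.
scaled, targets = prepare_tabular([[1.0, 5.0], [3.0, 5.0]], [0, 1])
```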