Releases: TileDB-Inc/TileDB-ML
v0.5.0
What's Changed
- Implement parallelized TensorflowTileDBDataset by @gsakkis in #132
- Expose TileDB-ML version at runtime by @georgeSkoumas in #134
- Bump tiledb to 0.14+ by @gsakkis in #135
- Separate tensor generation for x and y arrays by @gsakkis in #133
- scipy.sparse for 2D arrays and optional CSR sparse tensors by @gsakkis in #136
- Always generate (x,y) tuples by @gsakkis in #137
- Improve dependency versioning, add CI for multiple dependency versions, fix support for older ML versions by @gsakkis in #139
- Tensorboard Callback Support by @ktsitsi in #138
- Pytorch tensorboard support save and load by @ktsitsi in #141
- Extract tensorboard callback logic in a separate module by @gsakkis in #142
- Support tensorflow>2.5 by @gsakkis in #140
Full Changelog: v0.4.0...v0.5.0
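As a rough illustration of the runtime version lookup mentioned in #134, the sketch below reads an installed package's version through the standard library. It is not the mechanism added in that PR; the helper name `installed_version` is hypothetical, and `tiledb-ml` is assumed to be the distribution name used at install time.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


# Assumption: "tiledb-ml" is the name the package was installed under.
print(installed_version("tiledb-ml"))
```

The helper returns `None` instead of raising, so it can be called safely in environments where the package is missing.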
v0.4.0
What's Changed
- Return correctly sized sparse tensors by @gsakkis in #116
- Adding unit tests for read sparse order when batch_shuffle=False by @ktsitsi in #115
- Model API and Model Notebooks Update by @georgeSkoumas in #114
- Separate buffer_size for x/y arrays by @gsakkis in #117
- Specify buffer size in bytes instead of #rows by @gsakkis in #118
- Reenable batch shuffling by @gsakkis in #119
- Single shuffle option by @gsakkis in #120
- Better default buffer_size for dense arrays by @gsakkis in #121
- Use get_max_buffer_size() for dense arrays even when buffer_bytes is specified by @gsakkis in #122
- Add PyTorchTileDBDataLoader extending torch.utils.data.DataLoader by @gsakkis in #123
- Convert numpy arrays to dense tensors implicitly by @gsakkis in #125
- Remove batch size from tensor generator by @gsakkis in #124
- Replace shuffle parameter with shuffle buffer size by @gsakkis in #126
- Faster batching for PyTorchTileDBDataLoader by @gsakkis in #127
- Drop ThreadPoolExecutor for reading from x and y by @gsakkis in #128
- Split _batch_utils module by @gsakkis in #129
- Support nD-dim sparse arrays by @gsakkis in #130
- Add prefetch parameter to PyTorchTileDBDataLoader and TensorflowTileDBDataset by @gsakkis in #131
Full Changelog: v0.3.0...v0.4.0
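For readers unfamiliar with the idea behind a shuffle buffer size (#126), here is a minimal sketch of the general technique: items are shuffled locally through a bounded buffer, so memory use stays fixed while the output order is randomized. This is a generic illustration, not the library's implementation; `shuffle_buffer` and its parameters are hypothetical names.

```python
import random
from typing import Iterable, Iterator, List, Optional, TypeVar

T = TypeVar("T")


def shuffle_buffer(
    items: Iterable[T], buffer_size: int, seed: Optional[int] = None
) -> Iterator[T]:
    """Yield items in a locally shuffled order using a bounded buffer."""
    if buffer_size < 1:
        raise ValueError("buffer_size must be >= 1")
    rng = random.Random(seed)
    buffer: List[T] = []
    for item in items:
        buffer.append(item)
        if len(buffer) >= buffer_size:
            # Buffer is full: move a random element to the end and emit it.
            idx = rng.randrange(len(buffer))
            buffer[idx], buffer[-1] = buffer[-1], buffer[idx]
            yield buffer.pop()
    # Input exhausted: emit whatever is left in random order.
    rng.shuffle(buffer)
    yield from buffer


shuffled = list(shuffle_buffer(range(10), buffer_size=4, seed=0))
unshuffled = list(shuffle_buffer(range(5), buffer_size=1))
```

With `buffer_size=1` the input order is preserved, and a buffer at least as large as the input gives a full shuffle, so a single parameter covers both extremes.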
v0.3.0
This release includes quite a few updates.
Major Changes
- [API change] Consolidate the dense and sparse TileDB Dataset classes for Tensorflow and PyTorch
  - Tensorflow: Merge TensorflowTileDBDenseDataset and TensorflowTileDBSparseDataset into TensorflowTileDBDataset
  - PyTorch: Merge PyTorchTileDBDenseDataset and PyTorchTileDBSparseDataset into PyTorchTileDBDataset
- [API change] Remove __len__ method from *TileDBDataset instances
- [Enhancement] Support data loading from (dense x, sparse y) arrays
- [Enhancement] Read only the necessary dimensions and attributes
- [Enhancement] Extensive tiledb.ml.readers refactoring
- [Enhancement] Test and CI tooling revamp
- [Enhancement] Fix deprecation warning for attrs with var=True and fixed len dtype
- [Bug fix] Take into account all requested attributes for sparse datasets
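The practical effect of removing __len__ is that a dataset can still be iterated but no longer answers len(). The sketch below shows that behaviour with a hypothetical stand-in class (`RowStream`); it does not use or reproduce the library's dataset classes.

```python
from typing import Iterator, Sequence, Tuple


class RowStream:
    """Hypothetical iterable-style dataset: supports iteration but not len()."""

    def __init__(self, x: Sequence[float], y: Sequence[int]) -> None:
        if len(x) != len(y):
            raise ValueError("x and y must have the same number of rows")
        self._x = x
        self._y = y

    def __iter__(self) -> Iterator[Tuple[float, int]]:
        # Rows are produced lazily, one (x, y) pair at a time.
        yield from zip(self._x, self._y)


stream = RowStream([0.1, 0.2, 0.3], [0, 1, 0])
rows = list(stream)

try:
    len(stream)
    has_len = True
except TypeError:
    # No __len__ is defined, so len() is rejected.
    has_len = False
```

Code that previously relied on len() has to consume the iterator, or track the row count separately.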
What's Changed
- Add dense and sparse checks in Tensorflow and PyTorch data APIs by @georgeSkoumas in #91
- Fix broken links on README by @georgeSkoumas in #92
- Readme update by @georgeSkoumas in #93
- Test tweaks by @gsakkis in #94
- Tooling & test improvements by @gsakkis in #95
- Fix deprecation warning for attrs with var=True and fixed len dtype by @gsakkis in #96
- Move internal utils modules by @gsakkis in #97
- Fix CI Test Coverage Badge action and run only on master by @gsakkis in #98
- Refactor tensorflow datasets by @gsakkis in #99
- Fix max workers to 2 by @ktsitsi in #100
- Refactor tensorflow generators by @gsakkis in #101
- Refactor Pytorch datasets by @gsakkis in #102
- Fix: leave the cardinality of TensorflowTileDBDataset unknown by @gsakkis in #103
- Refactor common logic of Tensorflow and Pytorch data loaders by @gsakkis in #104
- Update reader notebook examples by @gsakkis in #105
- Drop PyTorchTileDBDataset.__len__ by @gsakkis in #106
- Re-run all model examples by @georgeSkoumas in #107
- Data loading for (dense x, sparse y) by @gsakkis in #109
- Fix BaseSparseBatch.set_buffer_offset by @gsakkis in #111
- Refactor data loader tests v2 by @gsakkis in #110
- Refactor data loader tests by @gsakkis in #108
- BaseSparseBatch fix: take into account all requested attributes by @gsakkis in #112
- Read only the necessary dimensions and attributes by @gsakkis in #113
Full Changelog: v0.2.6...v0.3.0
v0.2.6
This release contains the following changes.
- Always add timestamp when saving a model as a TileDB array for all supported frameworks, i.e., Tensorflow-Keras, PyTorch and Scikit-Learn.
- Tensorflow Keras sparse reader slicing in CSR format.
- PyTorch sparse reader slicing in CSR format.
- Add buffer for large batch reads to Tensorflow Keras Readers.
- Add buffer for large batch reads to PyTorch Readers.
- Batch shuffling and within batch shuffling for Tensorflow Keras readers.
- Batch shuffling and within batch shuffling for PyTorch readers.
- Parallel reads for X and Y for Tensorflow Keras readers.
- Parallel reads for X and Y for PyTorch readers.
- Example directory restructure.
- All example notebooks were accordingly updated.
- Docs were updated.
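To make "sparse reader slicing in CSR format" concrete, the sketch below slices a contiguous batch of rows out of a matrix stored as a CSR triple (data, indices, indptr). It uses plain Python lists and a hypothetical helper, `csr_row_slice`; it illustrates the CSR technique in general, not the readers' actual code.

```python
from typing import List, Tuple


def csr_row_slice(
    data: List[float],
    indices: List[int],
    indptr: List[int],
    start: int,
    stop: int,
) -> Tuple[List[float], List[int], List[int]]:
    """Return the CSR triple for rows [start, stop) of a CSR matrix."""
    n_rows = len(indptr) - 1
    if not 0 <= start <= stop <= n_rows:
        raise IndexError("row range out of bounds")
    # indptr[r] is the offset of row r's first stored value.
    lo, hi = indptr[start], indptr[stop]
    # Rebase the row pointers so the slice starts at offset 0.
    new_indptr = [p - lo for p in indptr[start : stop + 1]]
    return data[lo:hi], indices[lo:hi], new_indptr


# CSR form of:
# [[1, 0, 2],
#  [0, 0, 3],
#  [4, 5, 0],
#  [0, 6, 0]]
data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
indices = [0, 2, 2, 0, 1, 1]
indptr = [0, 2, 3, 5, 6]

# A batch made of rows 1 and 2.
batch = csr_row_slice(data, indices, indptr, 1, 3)
```

Because the stored values of consecutive rows are contiguous in CSR, a row batch is just a slice of `data` and `indices` plus a rebased `indptr`.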
What's Changed
- [Enhancement] Multiple Attribute Readers for PyTorch by @ktsitsi in #77
- [Enhancement] Tighten type annotations & type check with mypy git hook & GH action by @gsakkis in #78
- [Enhancement] Parallel batch reads for Pytorch by @ktsitsi in #79
- [Examples] Serverless End-To-End example PyTorch by @georgeSkoumas in #80
- [Enhancement] Parallel batch reads for TF by @ktsitsi in #81
- [Enhancement] PyTorch Batch and Within Batch Shuffling by @georgeSkoumas in #82
- [Enhancement] Tensorflow Batch and Within Batch Shuffling by @georgeSkoumas in #83
- [Enhancement] Pytorch Buffer Generator by @ktsitsi in #84
- [Enhancement] Tensorflow Buffer Generator by @ktsitsi in #85
- [Examples] Revisit All Reader Example Notebooks by @georgeSkoumas in #86
- [Bug] Fixing batching error using CSR format for Pytorch by @ktsitsi in #87
- [Bug] Fixing batching error using CSR format for TF by @ktsitsi in #88
- [Fix] Update broken docs.tiledb.com links by @gsakkis in #90
- [Fix/Enhancement] Adding default argument current time timestamp in open 'w' mode by @ktsitsi in #89
v0.2.5
This release fixes a bug in the TileDB version dependency.
v.0.2.4
This release concerns the following:
- Multiple attribute readers for Dense and Sparse TileDB Arrays for the Tensorflow Data API.
- Tensorflow-Keras Subclassed models can now be saved as TileDB Arrays.
- Minor updates and bug fixes.
v0.2.3
This release is a bug fix in the model metadata part. We used to JSON-serialise model metadata before storing it in a Model TileDB array. This has been removed, and metadata is now stored in the form in which users pass it.
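As a general illustration of why a JSON round trip can alter metadata, the sketch below shows how Python values change type when serialised and read back. It does not reproduce the library's former code, and the metadata keys are hypothetical.

```python
import json

# Hypothetical user-supplied metadata.
metadata = {"epochs": 10, "input_shape": (28, 28), 1: "first"}

# Serialise to JSON text and parse it back.
round_tripped = json.loads(json.dumps(metadata))

# JSON has no tuple type and only string keys, so both are converted.
shape_became_list = isinstance(round_tripped["input_shape"], list)
int_key_became_str = "1" in round_tripped and 1 not in round_tripped
```

After the round trip, the tuple is a list and the integer key is a string, so the stored metadata no longer matches what the user passed.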
v0.2.2
This release is a bug fix: machine learning framework imports are removed from the model base class.
v0.2.1
This release adds functionality for saving and loading machine learning models as TileDB arrays for the Tensorflow Keras, PyTorch and Scikit-Learn frameworks. It also adds functionality for reading data from TileDB arrays natively into the Tensorflow Data API and the PyTorch Dataloader API (for dense and sparse arrays) to train machine learning models, using Python generators.
Specifically, the current release contains the following.
- Save/Load Tensorflow Keras machine learning models as TileDB arrays.
- Save/Load PyTorch machine learning models as TileDB arrays.
- Save/Load Scikit-Learn machine learning models as TileDB arrays.
- Read data from dense TileDB arrays directly into Tensorflow Data API.
- Read data from sparse TileDB arrays directly into Tensorflow Data API.
- Read data from dense TileDB arrays directly into PyTorch Dataloader API.
- Read data from sparse TileDB arrays directly into PyTorch Dataloader API.
- Example notebooks for saving machine learning models as TileDB arrays locally.
- Example notebooks for saving machine learning models as TileDB arrays on TileDB-Cloud.
- Example notebooks for training Tensorflow Keras machine learning models reading data from dense TileDB arrays.
- Example notebooks for training Tensorflow Keras machine learning models reading data from sparse TileDB arrays.
- Example notebooks for training PyTorch machine learning models reading data from dense TileDB arrays.
- Example notebooks for training PyTorch machine learning models reading data from sparse TileDB arrays.
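The generator-based reading described above can be sketched as follows. Plain Python lists stand in for TileDB arrays, and `batch_generator` is a hypothetical name; this is an illustration of the pattern under those assumptions, not the library's reader.

```python
from typing import Iterator, List, Sequence, Tuple


def batch_generator(
    x: Sequence[float], y: Sequence[int], batch_size: int
) -> Iterator[Tuple[List[float], List[int]]]:
    """Lazily yield aligned (x, y) batches of at most batch_size rows."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of rows")
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    for start in range(0, len(x), batch_size):
        stop = start + batch_size
        # Only the rows of the current batch are sliced out.
        yield list(x[start:stop]), list(y[start:stop])


batches = list(batch_generator([1.0, 2.0, 3.0, 4.0, 5.0], [0, 1, 0, 1, 0], 2))
```

Because batches are produced on demand, only one batch needs to be in memory at a time. The last batch may be smaller than `batch_size`.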