Releases: TileDB-Inc/TileDB-ML

v0.5.0

10 May 14:04
9918515

What's Changed

  • Implement parallelized TensorflowTileDBDataset by @gsakkis in #132
  • Expose TileDB-ML version at runtime by @georgeSkoumas in #134
  • Bump tiledb to 0.14+ by @gsakkis in #135
  • Separate tensor generation for x and y arrays by @gsakkis in #133
  • scipy.sparse for 2D arrays and optional CSR sparse tensors by @gsakkis in #136
  • Always generate (x,y) tuples by @gsakkis in #137
  • Improve dependency versioning, add CI for multiple dependency versions, fix support for older ML versions by @gsakkis in #139
  • Tensorboard Callback Support by @ktsitsi in #138
  • Pytorch tensorboard support save and load by @ktsitsi in #141
  • Extract tensorboard callback logic in a separate module by @gsakkis in #142
  • Support tensorflow>2.5 by @gsakkis in #140

Full Changelog: v0.4.0...v0.5.0

v0.4.0

07 Apr 08:04
b3aae36

What's Changed

  • Return correctly sized sparse tensors by @gsakkis in #116
  • Adding unit tests for read sparse order when batch_shuffle=False by @ktsitsi in #115
  • Model API and Model Notebooks Update by @georgeSkoumas in #114
  • Separate buffer_size for x/y arrays by @gsakkis in #117
  • Specify buffer size in bytes instead of #rows by @gsakkis in #118
  • Reenable batch shuffling by @gsakkis in #119
  • Single shuffle option by @gsakkis in #120
  • Better default buffer_size for dense arrays by @gsakkis in #121
  • Use get_max_buffer_size() for dense arrays even when buffer_bytes is specified by @gsakkis in #122
  • Add PyTorchTileDBDataLoader extending torch.utils.data.DataLoader by @gsakkis in #123
  • Convert numpy arrays to dense tensors implicitly by @gsakkis in #125
  • Remove batch size from tensor generator by @gsakkis in #124
  • Replace shuffle parameter with shuffle buffer size by @gsakkis in #126
  • Faster batching for PyTorchTileDBDataLoader by @gsakkis in #127
  • Drop ThreadPoolExecutor for reading from x and y by @gsakkis in #128
  • Split _batch_utils module by @gsakkis in #129
  • Support nD-dim sparse arrays by @gsakkis in #130
  • Add prefetch parameter to PyTorchTileDBDataLoader and TensorflowTileDBDataset by @gsakkis in #131
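
The "shuffle buffer size" parameter from #126 suggests buffer-based shuffling, similar in spirit to `tf.data.Dataset.shuffle(buffer_size)`. The following is a minimal sketch of that general technique, not the library's implementation; the function name `shuffle_with_buffer` and its `seed` argument are hypothetical and used only for illustration.

```python
import random
from typing import Iterable, Iterator, List, Optional, TypeVar

T = TypeVar("T")


def shuffle_with_buffer(
    items: Iterable[T], buffer_size: int, seed: Optional[int] = None
) -> Iterator[T]:
    """Yield items in a locally shuffled order using a bounded buffer.

    Hypothetical helper for illustration; not part of TileDB-ML.
    """
    if buffer_size < 1:
        raise ValueError("buffer_size must be at least 1")
    rng = random.Random(seed)
    buffer: List[T] = []
    for item in items:
        if len(buffer) < buffer_size:
            # Fill the buffer before emitting anything.
            buffer.append(item)
            continue
        # Emit a random buffered item and store the new item in its slot.
        idx = rng.randrange(buffer_size)
        yield buffer[idx]
        buffer[idx] = item
    # The input is exhausted: drain the remaining items in random order.
    rng.shuffle(buffer)
    yield from buffer


if __name__ == "__main__":
    print(list(shuffle_with_buffer(range(10), buffer_size=4, seed=0)))
```

A larger buffer gives a more thorough shuffle at the cost of memory; a buffer of size 1 leaves the order unchanged.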

Full Changelog: v0.3.0...v0.4.0

v0.3.0

22 Feb 10:23
1e4ffe8

This release includes quite a few updates.

Major Changes

  • [API change] Consolidate the dense and sparse TileDB Dataset classes for Tensorflow and PyTorch
    • Tensorflow: Merge TensorflowTileDBDenseDataset and TensorflowTileDBSparseDataset into TensorflowTileDBDataset
    • PyTorch: Merge PyTorchTileDBDenseDataset and PyTorchTileDBSparseDataset into PyTorchTileDBDataset
  • [API change] Remove __len__ method from *TileDBDataset instances
  • [Enhancement] Support data loading from (dense x, sparse y) arrays
  • [Enhancement] Read only the necessary dimensions and attributes
  • [Enhancement] Extensive tiledb.ml.readers refactoring
  • [Enhancement] Test and CI tooling revamp
  • [Enhancement] Fix deprecation warning for attrs with var=True and fixed len dtype
  • [Bug fix] Take into account all requested attributes for sparse datasets

Full Changelog: v0.2.6...v0.3.0

v0.2.6

17 Nov 12:49
b44a50c

This release contains the following changes.

  1. Always add timestamp when saving a model as a TileDB array for all supported frameworks, i.e., Tensorflow-Keras, PyTorch and Scikit-Learn.
  2. Tensorflow Keras sparse reader slicing in CSR format.
  3. PyTorch sparse reader slicing in CSR format.
  4. Add buffer for large batch reads to Tensorflow Keras Readers.
  5. Add buffer for large batch reads to PyTorch Readers.
  6. Batch shuffling and within batch shuffling for Tensorflow Keras readers.
  7. Batch shuffling and within batch shuffling for PyTorch readers.
  8. Parallel reads for X and Y for Tensorflow Keras readers.
  9. Parallel reads for X and Y for PyTorch readers.
  10. Example directory restructure.
  11. All example notebooks were accordingly updated.
  12. Docs were updated.
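
To make the distinction in items 6 and 7 concrete, here is a small sketch under one reading of the terms: "batch shuffling" reorders whole batches, while "within batch shuffling" reorders the rows inside each batch. The helper names are hypothetical, and the `batch_shuffle` and `within_batch_shuffle` flags are modelled on the option names in these notes rather than on the readers' actual signatures.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def make_batches(rows: Sequence[T], batch_size: int) -> List[List[T]]:
    """Split rows into consecutive batches (the last one may be smaller)."""
    return [list(rows[i:i + batch_size]) for i in range(0, len(rows), batch_size)]


def shuffle_batches(
    batches: Sequence[Sequence[T]],
    batch_shuffle: bool = False,
    within_batch_shuffle: bool = False,
    seed: Optional[int] = None,
) -> List[List[T]]:
    """Return a shuffled copy of the batches; illustrative only."""
    rng = random.Random(seed)
    result = [list(batch) for batch in batches]
    if within_batch_shuffle:
        # Reorder rows inside each batch; batch membership is unchanged.
        for batch in result:
            rng.shuffle(batch)
    if batch_shuffle:
        # Reorder the batches themselves; each batch is moved as a unit.
        rng.shuffle(result)
    return result


if __name__ == "__main__":
    batches = make_batches(list(range(10)), batch_size=4)
    print(shuffle_batches(batches, batch_shuffle=True, seed=0))
    print(shuffle_batches(batches, within_batch_shuffle=True, seed=0))
```

In both modes every row stays in the batch it started in, so neither is equivalent to a full shuffle of the dataset.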

What's Changed

  • [Enhancement] Multiple Attribute Readers for PyTorch by @ktsitsi in #77
  • [Enhancement] Tighten type annotations & type check with mypy git hook & GH action by @gsakkis in #78
  • [Enhancement] Parallel batch reads for Pytorch by @ktsitsi in #79
  • [Examples] Serverless End-To-End example PyTorch by @georgeSkoumas in #80
  • [Enhancement] Parallel batch reads for TF by @ktsitsi in #81
  • [Enhancement] PyTorch Batch and Within Batch Shuffling by @georgeSkoumas in #82
  • [Enhancement] Tensorflow Batch and Within Batch Shuffling by @georgeSkoumas in #83
  • [Enhancement] Pytorch Buffer Generator by @ktsitsi in #84
  • [Enhancement] Tensorflow Buffer Generator by @ktsitsi in #85
  • [Examples] Revisit All Reader Example Notebooks by @georgeSkoumas in #86
  • [Bug] Fixing batching error using CSR format for Pytorch by @ktsitsi in #87
  • [Bug] Fixing batching error using CSR format for TF by @ktsitsi in #88
  • [Fix] Update broken docs.tiledb.com links by @gsakkis in #90
  • [Fix/Enhancement] Adding default argument current time timestamp in open 'w' mode by @ktsitsi in #89

v0.2.5

10 Sep 11:11
c3444ca

This release fixes a bug in the TileDB version dependency.

v0.2.4

10 Sep 07:30
ab9eb2d

This release concerns the following:

  1. Multiple attribute readers for Dense and Sparse TileDB Arrays for the Tensorflow Data API.
  2. Tensorflow-Keras Subclassed models can now be saved as TileDB Arrays.
  3. Minor updates and bug fixes.

v0.2.3

08 Jul 12:18
884638d

This release fixes a bug in model metadata handling. Previously, model metadata was JSON-serialised before being stored in a Model TileDB array. That step has been removed, and metadata is now stored in the form in which users pass it.
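
As a general illustration of why a JSON round trip can matter for metadata, the sketch below shows how Python's standard `json` module alters some values. This is plain Python behaviour and an assumption about the kind of problem involved, not a reproduction of the original bug; the `metadata` dictionary is made up for the example.

```python
import json

# Hypothetical metadata a user might attach to a model.
metadata = {"epochs": 10, "input_shape": (28, 28), 1: "first layer"}

# Serialise to JSON and parse it back, as a JSON-based store would.
round_tripped = json.loads(json.dumps(metadata))

# Tuples come back as lists and integer keys come back as strings,
# so the result no longer equals what the user passed in.
print(round_tripped)
print(round_tripped == metadata)
```

Storing values as passed avoids this class of silent type changes.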

v0.2.2

05 Jul 13:24
9e54322

This is a bug fix release. It removes machine learning framework imports from the model base class.

v0.2.1

05 Jul 12:28
a25b9ea

This release adds functionality for saving and loading machine learning models as TileDB arrays for the Tensorflow Keras, PyTorch and Scikit-Learn frameworks. It also adds functionality for reading data from TileDB arrays natively into the Tensorflow Data API and the PyTorch Dataloader API (for dense and sparse arrays) to train machine learning models, using Python generators.

Specifically, the current release contains the following.

  • Save/Load Tensorflow Keras machine learning models as TileDB arrays.
  • Save/Load PyTorch machine learning models as TileDB arrays.
  • Save/Load Scikit-Learn machine learning models as TileDB arrays.
  • Read data from dense TileDB arrays directly into Tensorflow Data API.
  • Read data from sparse TileDB arrays directly into Tensorflow Data API.
  • Read data from dense TileDB arrays directly into PyTorch Dataloader API.
  • Read data from sparse TileDB arrays directly into PyTorch Dataloader API.
  • Example notebooks for saving machine learning models as TileDB arrays locally.
  • Example notebooks for saving machine learning models as TileDB arrays on TileDB-Cloud.
  • Example notebooks for training Tensorflow Keras machine learning models reading data from dense TileDB arrays.
  • Example notebooks for training Tensorflow Keras machine learning models reading data from sparse TileDB arrays.
  • Example notebooks for training PyTorch machine learning models reading data from dense TileDB arrays.
  • Example notebooks for training PyTorch machine learning models reading data from sparse TileDB arrays.