Releases: argilla-io/argilla

v1.4.1

30 Mar 17:30

1.4.1

Bug Fixes

  • Copying datasets between workspaces with proper owner/workspace info. Closes #2562
  • Copy dataset with empty workspace to the default user workspace 905d4de
  • Using elasticsearch config to request backend version. Closes #2311

v1.3.2

30 Mar 16:29
59bb5ff

1.3.2

Bug Fixes

  • Copying datasets between workspaces with proper owner/workspace info. Closes #2562
  • Copy dataset with empty workspace to the default user workspace 905d4de
  • Using elasticsearch config to request backend version. Closes #2311

v1.2.2

30 Mar 16:29

1.2.2

Bug Fixes

  • Copying datasets between workspaces with proper owner/workspace info. Closes #2562
  • Copy dataset with empty workspace to the default user workspace 905d4de
  • Using elasticsearch config to request backend version. Closes #2311

v.1.5.0

22 Mar 16:19
7062819

🔆 Highlights

Dataset Settings page

We have added a Settings page for your datasets, where you can manage each dataset. Currently, you can add labels to the labeling schema and delete the dataset.

Add images to your records

The image in this record was generated using https://robohash.org

You can pass a URL in the metadata field _image_url and the image will be rendered in the Argilla UI. You can use this in the Text Classification and the Token Classification tasks.
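As a minimal sketch of the above, the snippet below builds the metadata dictionary for a record. The helper name `with_image`, its URL check, and the example URL path are hypothetical; only the `_image_url` key comes from these release notes. The Argilla calls in the trailing comment assume the 1.x Python client and a running server.

```python
from typing import Any, Dict, Optional


def with_image(image_url: str, metadata: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Return a copy of `metadata` with the `_image_url` key set (hypothetical helper)."""
    # This check belongs to the sketch only; it is not something Argilla is known to enforce.
    if not image_url.startswith(("http://", "https://")):
        raise ValueError("image_url should be an http(s) URL")
    merged = dict(metadata or {})
    merged["_image_url"] = image_url
    return merged


metadata = with_image("https://robohash.org/example", {"source": "demo"})
print(metadata)

# With the Argilla client installed and a server running, the dictionary
# could be attached to a record roughly like this (assumed 1.x client API):
#
#   import argilla as rg
#   record = rg.TextClassificationRecord(text="A friendly robot", metadata=metadata)
#   rg.log(record, name="example-dataset")
```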

Non-searchable metadata fields

Apart from the _image_url field, you can also pass other metadata fields that won't be used in queries or filters by adding an underscore at the start of the field name, e.g. _my_field.
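The naming convention can be illustrated with a small stdlib-only sketch. The function `split_metadata` is hypothetical and only mirrors the underscore rule described above; it does not show how the Argilla server handles these fields internally.

```python
from typing import Any, Dict, Tuple


def split_metadata(metadata: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Separate metadata keys by the leading-underscore convention.

    Returns (searchable, non_searchable), where non_searchable holds
    every key that starts with an underscore.
    """
    searchable = {k: v for k, v in metadata.items() if not k.startswith("_")}
    non_searchable = {k: v for k, v in metadata.items() if k.startswith("_")}
    return searchable, non_searchable


example = {
    "source": "news",  # regular field, usable in queries and filters
    "_image_url": "https://robohash.org/example",  # rendered in the UI, not searchable
    "_my_field": 1,  # arbitrary non-searchable field
}
searchable, non_searchable = split_metadata(example)
print(sorted(searchable), sorted(non_searchable))
```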

Load only what you need using rg.load

You can now specify the fields you want to load from your Argilla dataset. That way, you can avoid loading heavy vectors if you're using them for your annotations.

Two new tutorials (kudos @embonhomme & @burtenshaw)

Check out our new tutorials created by the community!

  • Compare the performance of two text classification models here
  • Multimodal bulk annotation here

Changelog

All notable changes to this project will be documented in this file. See standard-version for commit guidelines.

1.5.0 - 2023-03-21

Added

  • Add the option to choose which fields to retrieve when loading data from Argilla. Previously, rg.load took too long because of the vector field, even when users didn't need it. Closes #2398
  • Add new page and components for dataset settings. Closes #2442
  • Add ability to show an image in records (for TokenClassification and TextClassification) if a URL is passed in metadata with the key _image_url
  • Non-searchable fields support in metadata. #2570

Changed

  • Labels are now centralized in a specific vuex ORM called GlobalLabel Model, see #2210. This model is the same for TokenClassification and TextClassification (so both tasks have labels with color_id and shortcuts parameters in the vuex ORM)
  • The shortcuts improvement for labels #2339 has been moved to the vuex ORM in the dataset settings feature #2444
  • Update "Define a labeling schema" section in docs.
  • The record inputs are sorted alphabetically in UI by default. #2581

Fixes

  • Allow URL to be clickable in Jupyter notebook again. Closes #2527

Removed

  • Removing some data scan deprecated endpoints used by old clients. This change will break compatibility with client <v1.3.0
  • Stop using old scan deprecated endpoints in python client. This logic will break client compatibility with server version <1.3.0
  • Remove the previous way to add labels through the dataset page. Now labels can be added only through dataset settings page.

As always, thanks to our amazing contributors!

v1.4.0

09 Mar 09:40

🔆 Highlights

Enhanced annotation flow for all tasks

Improved bulk annotation and actions

A more stylish banner for available global actions. It includes an improved label selector to apply and remove labels in bulk.

We enhanced multi-label text classification annotations and now adding labels in bulk doesn't remove previous labels. This action will change the status of the records to Pending and you will need to validate the annotation to save the changes.

Learn more about bulk annotations and multi-label text classification annotations in our docs.

Clear and Reset actions

New actions to clear all annotations and reset changes. They can be used at the record level or as bulk actions.

Unvalidate and undiscard

Click the Validate or Discard button on a record that is already validated or discarded to undo that action.

Optimized one-record view

Improved view for a single record to enable a more focused annotation experience.

Prepare for training for SparkNLP Text2Text

Extended support to prepare Text2Text datasets for training with SparkNLP.

Learn more in our docs.

Extended shortcuts for token classification (kudos @cceyda)

In token classification tasks that have 10+ options, labels get assigned QWERTY keys as shortcuts.

Changelog

All notable changes to this project will be documented in this file. See standard-version for commit guidelines.

1.4.0 (2023-03-09)

Features

Bug Fixes

Documentation

As always, thanks to our amazing contributors!

  • Documentation update: adding missing n (#2362) by @Gnonpi
  • feat: Extend shortcuts to include alphabet for token classification (#2339) by @cceyda

v1.3.1

24 Feb 16:30

1.3.1 (2023-02-24)

Bug Fixes

  • quickstart: change default api key for the argilla quickstart image (#2357) (bb14f3c)

  • Resolve errors found in prepare_for_training during autotrain integration (#2411)
    Closes #2406
    Closes #2407
    Closes #2408
    Closes #2405

Documentation

v1.3.0

09 Feb 18:20
e55ea3e

🔆 Highlights

Keyword metric from Python client

The most important keywords in a dataset, or in a subset of it (selected with the query param), can be retrieved from Python. This can be useful for EDA and for defining programmatic labeling rules:

from argilla.metrics.commons import keywords
summary = keywords(name="example-dataset")
summary.visualize() # will plot a histogram with the results
summary.data # returns the raw result data

Prepare for training for SparkNLP and spaCy text-cat

Added a new framework, sparknlp, and extended the support for spacy to include text classification datasets. Check out this section of the docs.

Create train and test split with prepare_for_training

You can pass train_size and test_size to prepare_for_training to get train-test splits. This is especially useful for spaCy. Check out this section of the docs.
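To make the idea of a fractional train-test split concrete, here is a stdlib-only sketch. It is not Argilla's implementation: the function `split_records`, the shuffling, the rounding, and the `seed` parameter are all assumptions made for illustration. Only the parameter names `train_size` and `test_size` come from the notes above.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_records(
    records: Sequence[T], train_size: float, test_size: float, seed: int = 42
) -> Tuple[List[T], List[T]]:
    """Shuffle records and cut them into train and test parts by fraction."""
    if not 0 < train_size <= 1 or not 0 <= test_size < 1:
        raise ValueError("train_size must be in (0, 1] and test_size in [0, 1)")
    if train_size + test_size > 1 + 1e-9:  # small tolerance for float rounding
        raise ValueError("train_size + test_size must not exceed 1")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_train = round(len(shuffled) * train_size)
    # Never take more test records than remain after the training cut.
    n_test = min(round(len(shuffled) * test_size), len(shuffled) - n_train)
    return shuffled[:n_train], shuffled[n_train:n_train + n_test]


records = [f"text {i}" for i in range(10)]
train, test = split_records(records, train_size=0.8, test_size=0.2)
print(len(train), len(test))
```

The two parts never overlap because they are consecutive slices of the same shuffled list.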

Better repr for Dataset and Rule (kudos @Ankush-Chander)

When using the Python client, you now get a human-readable representation of Dataset and Rule entities.

Changelog

All notable changes to this project will be documented in this file. See standard-version for commit guidelines.

1.3.0 (2023-02-09)

Features

Bug Fixes

  • Client: formatting caused offset in prediction (#2241) (d65db5a)
  • Client: Log remaining data when shutdown the dataset consumer (#2269) (d78963e), closes #2189
  • validate predictions fails on text2text (#2271) (f68856e), closes #2252

Visual enhancements

Documentation

As always, thanks to our amazing contributors!

v1.2.1

23 Jan 22:13

1.2.1 (2023-01-23)

Bug Fixes

v1.2.0

12 Jan 15:45
e98245d

1.2.0 (2023-01-12)

🔆 Highlights

Data labelling and curation with similarity search

Since 1.2.0, Argilla supports adding vectors to records, which can then be used to find the records most similar to a given one. This feature combines vector (semantic) search with more traditional keyword- and filter-based search.
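The idea behind "most similar records" can be sketched with a brute-force cosine similarity ranking. This is an illustration of the concept only, not how Argilla performs the search; the function names, the record ids, and the toy two-dimensional vectors are all hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(
    query: Sequence[float], vectors: Dict[str, Sequence[float]], k: int = 2
) -> List[str]:
    """Return the ids of the k stored vectors closest to `query`."""
    ranked = sorted(
        vectors,
        key=lambda record_id: cosine_similarity(query, vectors[record_id]),
        reverse=True,
    )
    return ranked[:k]


# Toy record vectors keyed by a hypothetical record id.
vectors = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
print(most_similar([1.0, 0.0], vectors))
```

In practice, vectors come from an embedding model and have hundreds of dimensions, so a dedicated vector index is used instead of a linear scan like this one.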

View record info

You can now view all record details and fields, which can be useful for bookmarking, copy/pasting, and making ES queries.

View record timestamp

You can now see the timestamp associated with the record (event timestamp). It corresponds to the moment when the record was uploaded, or to a custom timestamp passed when logging the data (e.g., the time a prediction was made when using Argilla for monitoring).

Configure the base path of your Argilla UI (useful for proxies)

See: https://docs.argilla.io/en/latest/getting_started/installation/server_configuration.html#using-a-proxy

Features

Bug Fixes

Documentation

As always, thanks to our amazing contributors!

v1.1.1

29 Nov 22:07

1.1.1 (2022-11-29)

Bug Fixes

Documentation