Releases: argilla-io/argilla
v1.5.0
🔆 Highlights
Dataset Settings page
We have added a Settings page for your datasets. From there, you will be able to manage your dataset. Currently, it is possible to add labels to your labeling schema and delete the dataset.
Add images to your records
The image in this record was generated using https://robohash.org. You can pass a URL in the metadata field _image_url and the image will be rendered in the Argilla UI. You can use this in the Text Classification and Token Classification tasks.
Non-searchable metadata fields
Apart from the _image_url field, you can also pass other metadata fields that won't be used in queries or filters by adding an underscore at the start, e.g. _my_field.
Load only what you need using rg.load
You can now specify the fields you want to load from your Argilla dataset. That way, you can avoid loading heavy vectors if you don't need them for your annotations.
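A hedged sketch of the idea (the dataset and field names are placeholders, and the call itself needs a running server):

```python
# Select only the fields you need when loading a dataset.
# With the argilla client and a running server, this would be:
#
#   import argilla as rg
#   ds = rg.load("my-dataset", fields=fields)
#
fields = ["text", "annotation"]  # heavy fields such as "vectors" are skipped
```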
Two new tutorials (kudos @embonhomme & @burtenshaw)
Check out our new tutorials created by the community!
Changelog
All notable changes to this project will be documented in this file. See standard-version for commit guidelines.
1.5.0 (2023-03-21)
Added
- Add the fields to retrieve when loading the data from Argilla: rg.load takes too long because of the vector field, even when users don't need it. Closes #2398
- Add new page and components for dataset settings. Closes #2442
- Add ability to show an image in records (for TokenClassification and TextClassification) if a URL is passed in metadata with the key _image_url
- Support for non-searchable fields in metadata. #2570
Changed
- Labels are now centralized in a specific vuex ORM called GlobalLabel Model, see #2210. This model is the same for TokenClassification and TextClassification (so both tasks have labels with color_id and shortcuts parameters in the vuex ORM)
- The shortcuts improvement for labels #2339 has been moved to the vuex ORM in the dataset settings feature #2444
- Update "Define a labeling schema" section in docs.
- The record inputs are sorted alphabetically in UI by default. #2581
Fixes
- Allow URL to be clickable in Jupyter notebook again. Closes #2527
Removed
- Remove some deprecated data scan endpoints used by old clients. This change breaks compatibility with clients <v1.3.0
- Stop using the deprecated scan endpoints in the Python client. This breaks client compatibility with server versions <1.3.0
- Remove the previous way to add labels through the dataset page. Labels can now be added only through the dataset settings page.
As always, thanks to our amazing contributors!
- Documentation update: tutorial for text classification models comparison (#2426) by @embonhomme
- Docs: fix little typo (#2522) by @anakin87
- Docs: Tutorial on image classification (#2420) by @burtenshaw
v1.4.0
🔆 Highlights
Enhanced annotation flow for all tasks
Improved bulk annotation and actions
A more stylish banner for available global actions. It includes an improved label selector to apply and remove labels in bulk.
We enhanced multi-label text classification annotations and now adding labels in bulk doesn't remove previous labels. This action will change the status of the records to Pending and you will need to validate the annotation to save the changes.
Learn more about bulk annotations and multi-label text classification annotations in our docs.
Clear and Reset actions
New actions to clear all annotations and reset changes. They can be used at the record level or as bulk actions.
Unvalidate and undiscard
Click the Validate or Discard button on an already validated or discarded record to undo the action.
Optimized one-record view
Improved view for a single record to enable a more focused annotation experience.
Prepare for training for SparkNLP Text2Text
Extended support to prepare Text2Text datasets for training with SparkNLP.
Learn more in our docs.
Extended shortcuts for token classification (kudos @cceyda)
In token classification tasks that have 10+ options, labels get assigned QWERTY keys as shortcuts.
Changelog
All notable changes to this project will be documented in this file. See standard-version for commit guidelines.
1.4.0 (2023-03-09)
Features
- configure_dataset accepts a workspace as argument (#2503) (29c9ee3)
- Add active_client function to main argilla module (#2387) (4e623d4), closes #2183
- Add text2text support for prepare for training spark nlp (#2466) (21efb83), closes #2465 #2482
- Allow passing workspace as client param for rg.log or rg.load (#2425) (b3b897a), closes #2059
- Bulk annotation improvement (#2437) (3fce915), closes #2264
- Deprecate chunk_size in favor of batch_size for rg.log (#2455) (3ebea76), closes #2453
- Expose batch_size parameter for rg.load (#2460) (e25be3e), closes #2454 #2434
- Extend shortcuts to include alphabet for token classification (#2339) (4a92b35)
Bug Fixes
- added flexible app redirect to docs page (#2428) (5600301), closes #2377
- added regex match to set workspace method (#2427) (d789fa1), closes #2388
- error when loading record with empty string query (#2429) (fc71c3b), closes #2400 #2303
- Remove extra-action dropdown state after navigation (#2479) (9328994), closes #2158
Documentation
- Add AutoTrain to readme (7199780)
- Add migration to label schema section (#2435) (d57a1e5), closes #2003
- Adds zero+few shot tutorial with SetFit (#2409) (6c679ad)
- Update readme with quickstart section and new links to guides (#2333) (91a77ad)
As always, thanks to our amazing contributors!
v1.3.1
v1.3.0
🔆 Highlights
Keyword metric from Python client
Most important keywords in the dataset or a subset (using the query param) can be retrieved from Python. This can be useful for EDA and defining programmatic labeling rules:
from argilla.metrics.commons import keywords
summary = keywords(name="example-dataset")
summary.visualize() # will plot a histogram with the results
summary.data # returns the raw result data
Prepare for training for SparkNLP and spaCy text-cat
Added a new framework sparknlp
and extended the support for spacy
including text classification datasets. Check out this section of the docs
Create train and test split with prepare_for_training
You can pass train_size and test_size to prepare_for_training to get train-test splits. This is especially useful for spaCy. Check out this section of the docs.
Better repr for Dataset and Rule (kudos @Ankush-Chander)
When using the Python client, you now get a human-readable representation of Dataset and Rule entities.
Changelog
All notable changes to this project will be documented in this file. See standard-version for commit guidelines.
1.3.0 (2023-02-09)
Features
- better log error handling (#2245) (66e5cce), closes #2005
- Change view mode order in sidebar (#2215) (dff1ea1), closes #2214
- Client: Expose keywords dataset metrics (#2290) (a945c5e), closes #2135
- Client: relax client constraints for rules management (#2242) (6e749b7), closes #2048
- Create a multiple contextual help component (#2255) (a35fae2), closes #1926
- Include record event_timestamp (#2156) (3992b8f), closes #1911
- updated the prepare_for_training methods (#2225) (e53c201), closes #2154 #2132 #2122 #2045 #1697
Bug Fixes
- Client: formatting caused offset in prediction (#2241) (d65db5a)
- Client: Log remaining data when shutdown the dataset consumer (#2269) (d78963e), closes #2189
- validate predictions fails on text2text (#2271) (f68856e), closes #2252
Visual enhancements
- Fine tune menu record card (#2240) (62148e5), closes #2224
- Rely on box-shadow to provide the secondary underline (#2283) (d786171), closes #2282
Documentation
- Add deploy on Spaces buttons (#2293) (60164a0)
- fix typo in documentation (#2296) (ab8e85e)
- Improve deployment and quickstart docs and tutorials (#2201) (075bf94), closes #2162
- More spaces! (#2309) (f02eb60)
- Remove cut-off sentence in docs codeblock (#2287) (7e87f20)
- Rephrase "to know more" into "to learn more" in Quickstart login page (#2305) (6082a26)
- Replace leftover rubrix.apikey with argilla.apikey (#2286) (4871127), closes #2254
- Simplify token attributions code block (#2322) (4cb6ae1)
- Tutorial buttons (#2310) (d6e02de)
- Update colab guide (#2320) (e48a7cc)
- Update HF Spaces creation image (#2314) (e4b2a04)
As always, thanks to our amazing contributors!
v1.2.1
1.2.1 (2023-01-23)
Bug Fixes
- Allow non-alphanumeric characters for login (#2207) (629499a), closes #1879
- Client: Stop using ujson for client actions (#2211) (920213e)
- doc typos (#2203) (b353a30)
- Read statics with proper encoding (#2234) (92739bf), closes #2219
- Remove 3.9+ string methods (#2230) (4ed1ff0), closes #2192
- Remove argilla:stats in metadata filter (#2218) (a412b22), closes #2217, #2220
v1.2.0
1.2.0 (2023-01-12)
🔆 Highlights
Data labelling and curation with similarity search
Since 1.2.0 Argilla supports adding vectors to Argilla records which can then be used for finding the most similar records to a given one. This feature uses vector or semantic search combined with more traditional search (keyword and filter based).
View record info
You can now find all record details and fields, which can be useful for bookmarking, copy/pasting, and making ES queries.
View record timestamp
You can now see the timestamp associated with the record (event timestamp), which corresponds to the moment the record was uploaded, or a custom timestamp passed when logging the data (e.g., the moment the prediction was made, when using Argilla for monitoring).
Configure the base path of your Argilla UI (useful for proxies)
Features
- Allow to launch the argilla server in a different base_url (#2080) (63d624d), closes #1914 #1899
- Check es connection on startup with retries (#2141) (7a63bea)
- enable partial record update (#2118) (4ed0d95)
- Improve the dataset_labels metric processing (#1978) (1c3235e), closes #1818
- Include record event_timestamp (#2156) (5b75ade), closes #1911
- Include record info view and remove metadata filter (#2079) (901d45a), closes #1927 #1849
- Raw records scan endpoint (#2102) (1b63d95)
- reuse the same httpx async client instance (#1958) (a70cb6c), closes #1886
- Search: Allow passing raw es query in search query (#2098) (0541798)
- set record timestamp by default (#1970) (309fd9f), closes #1892
- Similarity vector search (#1768) (#1998) (32958f4), closes #1757
- UI: remove mixins to hide scroll bar in drop down (#2000) (95ad9b8), closes #1928
Bug Fixes
- #1912 hide empty menu dropdown (#1981) (d90390b)
- Avoid manipulating DOM (#1895) (6939b28), closes #1765
- catch ImportError for telemetry module (#1989) (25513b7)
- Client: check url underscore only for hostnames (#2185) (ec5726a)
- client: prevent python client response json parse error (#2186) (5549ab0)
- Compute predicted properly for token classification [REINDEX_DATASET_REF] (#1975) (a29a198), closes #1955
- Disable shortcuts for pagination when focus is on an input tag (#1995) (af07f3e), closes #1976
- Migration: Set dynamic to false for old indices (#2167) (15a18d7)
- Prevent show "No result" before data is loaded (#2014) (0799425), closes #1936
Documentation
- Add new tutorial about zeroshot sentiment analysis with GPT-3 (#2011) (d3c43ab)
- added additional explanation for datetime ranges (#2120) (c8c3dc9), closes #2119
- Adds Hugging Face Space deployment guide (#2109) (a7a47c4)
- changed DatasetForTextGeneration to DatasetForText2Text (#2090) (8cde28b), closes #2089
- Fix load docstring example (#2050) (7e2af7f), closes #1951
- fixed typo errors for terminology section (#2025) (1056736)
- include new OG image (#2017) (710ab3f)
- Include og image (#2016) (85442e4)
- Maintain menu position during navigation (#1935) (82c6e08), closes #1864
- New setfit tutorial (#2002) (43c66b2)
- Replace OG image (#2018) (894b273)
- Replace video with image (#1990) (359b637)
- reverted to correct apikey reference (#2136) (f32f2b8), closes #2074
As always, thanks to our amazing contributors!
- Add Azure deployment tutorial (#2124) by @burtenshaw
- Create training-textclassification-activelearning-with-GPU.ipynb (#2020) by @MoritzLaurer