#6 Implement metric pipeline to analyze model performance #32

botirk38 · 2024-07-19T00:43:02Z

Why?

We need this feature to analyze the performance of our pipeline against specified metrics. The primary use cases are:

To evaluate the quality of reconstructed data compared to original data using standardized metrics (e.g., BLEU score).
To provide a flexible and configurable way to apply different metrics to various data columns.
To integrate metric analysis seamlessly into our SONAR pipelines for Hugging Face datasets.

This implementation will allow us to:

Quantitatively assess the performance of our data processing or model outputs.
Easily switch between different metrics for evaluation.
Identify areas where the pipeline's performance falls below a specified threshold.

How?

Key technical decisions made in this implementation:

Created a MetricPipelineConfig class extending PipelineConfig to encapsulate metric-specific configuration.
Implemented a MetricAnalyzerPipeline class that inherits from the base Pipeline class.
Utilized the Hugging Face evaluate library to load and compute metrics.
Implemented batch processing to efficiently handle large datasets.
Added logging throughout the pipeline for better monitoring and debugging.
Used type hints and dataclasses for improved code readability and maintainability.
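
As a rough illustration of the configuration design listed above, the sketch below shows a dataclass-based config that extends a base config. The `PipelineConfig` here is only a stand-in, and every field name (`columns`, `batch_size`, `metrics`, `low_score_threshold`, `reconstructed_suffix`) is an assumption made for the example, not the actual fields in this PR.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PipelineConfig:
    """Hypothetical stand-in for the base config used by the SONAR pipelines."""

    columns: List[str] = field(default_factory=list)
    batch_size: int = 32


@dataclass
class MetricPipelineConfig(PipelineConfig):
    """Metric-specific settings layered on the base config (assumed fields)."""

    metrics: List[str] = field(default_factory=lambda: ["bleu"])
    low_score_threshold: float = 0.5
    reconstructed_suffix: str = "_reconstructed"


# Only the values that differ from the defaults need to be spelled out.
config = MetricPipelineConfig(
    columns=["text"], metrics=["bleu"], low_score_threshold=0.3
)
```

Keeping the metric settings in their own dataclass means the base config stays untouched and type checkers can still validate each field.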

Key components:

MetricOverwrites: Allows for easy overriding of metric-specific configurations.
compute_metric: Calculates the metric score for a given set of original and reconstructed data.
process_batch: Applies the metric computation to each specified column in a batch.
__call__: Processes the entire dataset using the configured metric.
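
To make the split between compute_metric and process_batch concrete, here is a minimal sketch under stated assumptions. The PR loads its metrics through the Hugging Face evaluate library; to stay runnable without downloads, this example swaps in a pure-Python exact-match function. The per-row scoring, the `_reconstructed` suffix, and the `_score` output column are all assumptions, not the behavior of the code in this PR.

```python
from typing import Callable, Dict, List

# A metric takes (predictions, references) and returns a single score.
MetricFn = Callable[[List[str], List[str]], float]


def exact_match(predictions: List[str], references: List[str]) -> float:
    """Stand-in metric: fraction of predictions identical to their reference."""
    if not references:
        return 0.0
    matches = sum(p == r for p, r in zip(predictions, references))
    return matches / len(references)


def compute_metric(
    metric: MetricFn, originals: List[str], reconstructed: List[str]
) -> float:
    """Score reconstructed data against the original data."""
    if len(originals) != len(reconstructed):
        raise ValueError("originals and reconstructed must have the same length")
    return metric(reconstructed, originals)


def process_batch(
    batch: Dict[str, List[str]],
    columns: List[str],
    metric: MetricFn,
    suffix: str = "_reconstructed",  # assumed naming convention
) -> Dict[str, list]:
    """Add one score per row for every configured column."""
    result: Dict[str, list] = dict(batch)
    for column in columns:
        originals = batch[column]
        reconstructed = batch[column + suffix]
        if len(originals) != len(reconstructed):
            raise ValueError(f"length mismatch in column {column!r}")
        result[column + "_score"] = [
            compute_metric(metric, [o], [r])
            for o, r in zip(originals, reconstructed)
        ]
    return result


batch = {"text": ["a b", "c d"], "text_reconstructed": ["a b", "c x"]}
scored = process_batch(batch, ["text"], exact_match)
```

Because the metric is passed in as a plain callable, switching metrics only changes the argument, not the batch logic.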

Work in Progress (WIP) parts:

Error handling could be expanded, especially in the process_batch method.
The results attribute in MetricAnalyzerPipeline is initialized but not used in the current implementation.
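
One possible direction for both WIP items is sketched below: a small helper that catches per-column failures, logs them, and records every outcome in a results list. `ResultRecorder`, `score_column`, and the shape of each results entry are hypothetical names chosen for this example; the PR does not define them.

```python
import logging
from typing import Callable, Dict, List, Optional

logger = logging.getLogger("metric_analyzer_sketch")

MetricFn = Callable[[List[str], List[str]], float]


class ResultRecorder:
    """Hypothetical helper: scores one column and remembers the outcome."""

    def __init__(self) -> None:
        self.results: List[Dict[str, object]] = []

    def score_column(
        self,
        column: str,
        metric: MetricFn,
        originals: List[str],
        reconstructed: List[str],
    ) -> Optional[float]:
        score: Optional[float] = None
        error: Optional[str] = None
        try:
            if len(originals) != len(reconstructed):
                raise ValueError(
                    f"{len(originals)} originals vs "
                    f"{len(reconstructed)} reconstructions"
                )
            score = metric(reconstructed, originals)
        except (ValueError, TypeError, ZeroDivisionError) as exc:
            # Log and keep going so one bad column does not abort the batch.
            logger.warning("Skipping column %r: %s", column, exc)
            error = str(exc)
        self.results.append({"column": column, "score": score, "error": error})
        return score


def overlap(predictions: List[str], references: List[str]) -> float:
    """Stand-in metric: fraction of exact matches."""
    return sum(p == r for p, r in zip(predictions, references)) / len(references)


recorder = ResultRecorder()
recorder.score_column("text", overlap, ["a", "b"], ["a", "x"])
recorder.score_column("title", overlap, ["a", "b"], ["a"])  # length mismatch
```

After these two calls, `recorder.results` holds one entry with a score and one entry with an error message, which gives the otherwise unused results attribute a purpose.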

Test plan

To test these changes, we will:

Create unit tests for individual components of the MetricAnalyzerPipeline.
Implement integration tests to ensure proper functioning within the SONAR pipeline ecosystem.
Test with various metrics and dataset types to ensure flexibility and robustness.
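
A unit test along these lines could look like the sketch below. It stubs the metric object with `unittest.mock.MagicMock` so the test never downloads a real metric. The `compute_metric` shown here is a simplified, hypothetical version written for the example (including the `key` parameter), not the function from this PR.

```python
import unittest
from typing import List
from unittest.mock import MagicMock


def compute_metric(
    metric, originals: List[str], reconstructed: List[str], key: str = "bleu"
) -> float:
    """Simplified stand-in for the function under test."""
    if len(originals) != len(reconstructed):
        raise ValueError("originals and reconstructed must have the same length")
    result = metric.compute(
        predictions=reconstructed, references=[[o] for o in originals]
    )
    return result[key]


class ComputeMetricTest(unittest.TestCase):
    def setUp(self) -> None:
        # The stub returns a fixed score, so only the wiring is tested.
        self.metric = MagicMock()
        self.metric.compute.return_value = {"bleu": 0.75}

    def test_forwards_predictions_and_references(self) -> None:
        score = compute_metric(self.metric, ["hello world"], ["hello word"])
        self.metric.compute.assert_called_once_with(
            predictions=["hello word"], references=[["hello world"]]
        )
        self.assertEqual(score, 0.75)

    def test_length_mismatch_raises_before_calling_metric(self) -> None:
        with self.assertRaises(ValueError):
            compute_metric(self.metric, ["a", "b"], ["a"])
        self.metric.compute.assert_not_called()
```

Saved in a test module, this can be run with `python -m unittest`; integration tests against real metrics and datasets would sit alongside it.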

antoine-tran

This PR looks good to me. I think some of the code needs to be improved, but once the comments are addressed (and the linter is happy, of course), we can land this.

Add metric analyzer class

1ef7185

facebook-github-bot added the CLA Signed label Jul 19, 2024

artemru reviewed Jul 30, 2024

huggingface_pipelines/metric_analyzer.py (outdated, resolved)

Bugfixes in relation to formatting columns for metrics and allow multiple metrics to be passed

8bda976

botirk38 changed the title ~~Implement metric pipeline to analyze model performance~~ #5 Implement metric pipeline to analyze model performance Aug 2, 2024

botirk38 changed the title ~~#5 Implement metric pipeline to analyze model performance~~ #6 Implement metric pipeline to analyze model performance Aug 12, 2024

antoine-tran approved these changes Aug 14, 2024

huggingface_pipelines/metric_analyzer.py (outdated, resolved)

huggingface_pipelines/metric_analyzer.py (outdated, resolved)

botirk38 added 11 commits August 14, 2024 11:03

Remove joining functionality we can enforce List[str] to be passed

61c6e96

Improve metric pipeline durability for different data structures

f75d9fe

Fix linting issues

a39e018

Check length of column and reconstructed column to ensure they match

14454f4

Add unit tests for metric analyzer

c937e04

Add unit tests for metric analyzer

0b77b3a

Fix linting issues

d9a7d4b

Use load from evaluate instead of datasets

8fd649e

Add tests for metric analyzer

a7abdc5

Fix linting issues

0e0f4bd

Fix mypy inheritance issue

8d7af5b

botirk38 merged commit 9004185 into facebookresearch:main Sep 3, 2024
4 of 5 checks passed

artemru added a commit that referenced this pull request Sep 6, 2024

Revert "#6 Implement metric pipeline to analyze model performance (#32)"

75f5c86

This reverts commit 9004185.

artemru added a commit that referenced this pull request Sep 6, 2024

Revert "#6 Implement metric pipeline to analyze model performance (#32)" (#39)

dbdcb7c

This reverts commit 9004185.
