
#6 Implement metric pipeline to analyze model performance #32

Merged
botirk38 merged 13 commits into facebookresearch:main from botirk38:feature/metric-analyzer
Sep 3, 2024


@botirk38
Contributor

Why?

We need to implement this feature to analyze the performance of our pipeline according to specified metrics. The primary use cases are:

  1. To evaluate the quality of reconstructed data compared to original data using standardized metrics (e.g., BLEU score).
  2. To provide a flexible and configurable way to apply different metrics to various data columns.
  3. To integrate metric analysis seamlessly into our SONAR pipelines for Hugging Face datasets.

This implementation will allow us to:

  • Quantitatively assess the performance of our data processing or model outputs.
  • Easily switch between different metrics for evaluation.
  • Identify areas where the pipeline's performance falls below a specified threshold.

How?

Key technical decisions made in this implementation:

  1. Created a MetricPipelineConfig class extending PipelineConfig to encapsulate metric-specific configuration.
  2. Implemented a MetricAnalyzerPipeline class that inherits from the base Pipeline class.
  3. Utilized the Hugging Face evaluate library to load and compute metrics.
  4. Implemented batch processing to efficiently handle large datasets.
  5. Added logging throughout the pipeline for better monitoring and debugging.
  6. Used type hints and dataclasses for improved code readability and maintainability.
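The configuration layering in decisions 1 and 6 might look roughly like the sketch below. This is only an illustration: the `PipelineConfig` shown here is a hypothetical stand-in for the project's real base class, and every field name (`columns`, `batch_size`, `metric_name`, `low_score_threshold`, `reconstructed_suffix`) is an assumption rather than the PR's actual schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PipelineConfig:
    """Hypothetical stand-in for the project's base pipeline config."""

    columns: List[str] = field(default_factory=list)
    batch_size: int = 32


@dataclass
class MetricPipelineConfig(PipelineConfig):
    """Metric-specific settings on top of the base config (field names are assumed)."""

    metric_name: str = "bleu"
    low_score_threshold: float = 0.5
    reconstructed_suffix: str = "_reconstructed"

    def reconstructed_column(self, column: str) -> str:
        # Name of the column assumed to hold the reconstructed version of `column`.
        return f"{column}{self.reconstructed_suffix}"


config = MetricPipelineConfig(columns=["text"], batch_size=8, metric_name="bleu")
print(config.reconstructed_column("text"))
```

Keeping the metric name and threshold in the config, rather than in the pipeline class, is what would let callers switch metrics without touching pipeline code.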

Key components:

  • MetricOverwrites: Allows for easy overriding of metric-specific configurations.
  • compute_metric: Calculates the metric score for a given set of original and reconstructed data.
  • process_batch: Applies the metric computation to each specified column in a batch.
  • __call__: Processes the entire dataset using the configured metric.
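A minimal sketch of how `compute_metric` and `process_batch` could fit together is shown below. It is not the PR's code: the function signatures, the `_reconstructed` column suffix, and the `_score` output column are assumptions, and a toy `exact_match` function stands in for a metric loaded through the Hugging Face `evaluate` library so that the example runs without downloads.

```python
import logging
from typing import Callable, Dict, List

logger = logging.getLogger(__name__)

# Assumed metric interface: (predictions, references) -> {"score": float}
MetricFn = Callable[[List[str], List[str]], Dict[str, float]]


def exact_match(predictions: List[str], references: List[str]) -> Dict[str, float]:
    """Toy stand-in metric: fraction of predictions identical to their reference."""
    hits = sum(p == r for p, r in zip(predictions, references))
    return {"score": hits / len(references) if references else 0.0}


def compute_metric(metric: MetricFn, original: List[str], reconstructed: List[str]) -> float:
    """Score reconstructed data against the original data."""
    if len(original) != len(reconstructed):
        raise ValueError("original and reconstructed must have the same length")
    return metric(reconstructed, original)["score"]


def process_batch(
    batch: Dict[str, List[str]],
    columns: List[str],
    metric: MetricFn,
    suffix: str = "_reconstructed",  # hypothetical naming convention
) -> Dict[str, list]:
    """Add one score column per configured column, leaving the input batch unmodified."""
    out: Dict[str, list] = dict(batch)
    for column in columns:
        score = compute_metric(metric, batch[column], batch[column + suffix])
        # Repeat the batch-level score so the new column matches the batch length.
        out[f"{column}_score"] = [score] * len(batch[column])
        logger.info("column=%s score=%.3f", column, score)
    return out


batch = {"text": ["a b", "c d"], "text_reconstructed": ["a b", "c x"]}
result = process_batch(batch, ["text"], exact_match)
print(result["text_score"])  # [0.5, 0.5]
```

With the `evaluate` library, the stand-in could be replaced by a thin wrapper around `evaluate.load("bleu").compute(predictions=..., references=...)` that maps the returned `"bleu"` value to the `"score"` key used here.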

Work in Progress (WIP) parts:

  • Error handling could be expanded, especially in the process_batch method.
  • The results attribute in MetricAnalyzerPipeline is initialized but not used in the current implementation.

Test plan

To test these changes, we will:

  1. Create unit tests for individual components of the MetricAnalyzerPipeline.
  2. Implement integration tests to ensure proper functioning within the SONAR pipeline ecosystem.
  3. Test with various metrics and dataset types to ensure flexibility and robustness.
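As an illustration of the first point, a unit test can stub the metric object so that no model or metric download is needed. The sketch below is hypothetical: `score_columns` and `columns_below_threshold` are invented helpers, not functions from the PR, and the stub only mimics the `compute(predictions=..., references=...)` call shape that `evaluate` metrics expose.

```python
from typing import Dict, List
from unittest.mock import Mock


def score_columns(metric, batch: Dict[str, List[str]], columns: List[str],
                  suffix: str = "_reconstructed") -> Dict[str, float]:
    """Hypothetical helper: one metric score per configured column."""
    scores = {}
    for column in columns:
        result = metric.compute(
            predictions=batch[column + suffix], references=batch[column]
        )
        scores[column] = result["bleu"]
    return scores


def columns_below_threshold(scores: Dict[str, float], threshold: float) -> List[str]:
    """Hypothetical helper: columns whose score falls below the threshold."""
    return sorted(c for c, s in scores.items() if s < threshold)


def test_flags_low_scoring_columns() -> None:
    # The stub returns one canned result per call, in column order.
    metric = Mock()
    metric.compute.side_effect = [{"bleu": 0.9}, {"bleu": 0.2}]
    batch = {
        "title": ["hello"], "title_reconstructed": ["hello"],
        "body": ["some text"], "body_reconstructed": ["other text"],
    }

    scores = score_columns(metric, batch, ["title", "body"])

    assert scores == {"title": 0.9, "body": 0.2}
    assert columns_below_threshold(scores, 0.5) == ["body"]
    assert metric.compute.call_count == 2


test_flags_low_scoring_columns()
```

Integration tests against real metrics and datasets would still be needed to cover the second and third points.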

facebook-github-bot added the "CLA Signed" label Jul 19, 2024
botirk38 changed the title from "Implement metric pipeline to analyze model performance" to "#5 Implement metric pipeline to analyze model performance" Aug 2, 2024
botirk38 changed the title from "#5 Implement metric pipeline to analyze model performance" to "#6 Implement metric pipeline to analyze model performance" Aug 12, 2024
Contributor

antoine-tran left a comment


This PR looks good to me. I think some code needs to be improved, but once the comments are addressed (and the linter is happy, of course), we can land this.

botirk38 merged commit 9004185 into facebookresearch:main Sep 3, 2024
artemru added a commit that referenced this pull request Sep 6, 2024
artemru added a commit that referenced this pull request Sep 6, 2024
4 participants