#6 Implement metric pipeline to analyze model performance #32

Merged
merged 13 commits into facebookresearch:main
Sep 3, 2024

Conversation

botirk38
Collaborator

Why?

We need to implement this feature to analyze the performance of our pipeline according to specified metrics. The primary use cases are:

  1. To evaluate the quality of reconstructed data compared to original data using standardized metrics (e.g., BLEU score).
  2. To provide a flexible and configurable way to apply different metrics to various data columns.
  3. To integrate metric analysis seamlessly into our SONAR pipelines for Hugging Face datasets.

This implementation will allow us to:

  • Quantitatively assess the performance of our data processing or model outputs.
  • Easily switch between different metrics for evaluation.
  • Identify areas where the pipeline's performance falls below a specified threshold.

How?

Key technical decisions made in this implementation:

  1. Created a MetricPipelineConfig class extending PipelineConfig to encapsulate metric-specific configuration.
  2. Implemented a MetricAnalyzerPipeline class that inherits from the base Pipeline class.
  3. Utilized the Hugging Face evaluate library to load and compute metrics.
  4. Implemented batch processing to efficiently handle large datasets.
  5. Added logging throughout the pipeline for better monitoring and debugging.
  6. Used type hints and dataclasses for improved code readability and maintainability.
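A minimal sketch of how decisions 1 and 6 could fit together, assuming a dataclass-based base config. The `PipelineConfig` shown here is a hypothetical stand-in, and every field name (`columns`, `batch_size`, `metric_name`, `low_score_threshold`) is an illustrative assumption rather than the actual field set in this PR:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PipelineConfig:
    """Hypothetical stand-in for the project's base config (fields are assumed)."""

    columns: List[str] = field(default_factory=list)
    batch_size: int = 8


@dataclass
class MetricPipelineConfig(PipelineConfig):
    """Metric-specific settings layered on top of the base config (fields are assumed)."""

    metric_name: str = "bleu"
    low_score_threshold: float = 0.5


# Base and metric-specific options are set in one place.
config = MetricPipelineConfig(columns=["text"], batch_size=4, metric_name="bleu")
```

Because every field has a default, the subclass can add options without forcing existing callers of the base config to change.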

Key components:

  • MetricOverwrites: Allows for easy overriding of metric-specific configurations.
  • compute_metric: Calculates the metric score for a given set of original and reconstructed data.
  • process_batch: Applies the metric computation to each specified column in a batch.
  • __call__: Processes the entire dataset using the configured metric.
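The way these components relate can be sketched roughly as follows. This is not the code from the PR: the class name `MetricAnalyzerSketch`, the `_reconstructed` column suffix, the `_score` output column, and the plain-dict batch format are all assumptions, and a toy exact-match function stands in for a metric loaded through the `evaluate` library so that the example runs without downloads.

```python
import logging
from typing import Any, Callable, Dict, List

logger = logging.getLogger(__name__)

Batch = Dict[str, List[Any]]
MetricFn = Callable[[List[str], List[str]], float]


def exact_match(predictions: List[str], references: List[str]) -> float:
    """Toy stand-in metric: fraction of predictions equal to their reference."""
    if not references:
        return 0.0
    hits = sum(p == r for p, r in zip(predictions, references))
    return hits / len(references)


class MetricAnalyzerSketch:
    """Illustrative only; names and data layout are assumptions."""

    def __init__(
        self,
        columns: List[str],
        metric: MetricFn = exact_match,
        suffix: str = "_reconstructed",
        batch_size: int = 2,
    ) -> None:
        self.columns = columns
        self.metric = metric
        self.suffix = suffix
        self.batch_size = batch_size

    def compute_metric(self, original: List[str], reconstructed: List[str]) -> float:
        # Reconstructed text plays the role of predictions, original of references.
        return self.metric(reconstructed, original)

    def process_batch(self, batch: Batch) -> Batch:
        out = dict(batch)
        for column in self.columns:
            score = self.compute_metric(batch[column], batch[column + self.suffix])
            logger.info("column=%s score=%.3f", column, score)
            # One batch-level score, repeated so the new column matches the batch length.
            out[f"{column}_score"] = [score] * len(batch[column])
        return out

    def __call__(self, rows: Batch) -> List[Batch]:
        n_rows = len(next(iter(rows.values()), []))
        outputs = []
        for start in range(0, n_rows, self.batch_size):
            batch = {k: v[start:start + self.batch_size] for k, v in rows.items()}
            outputs.append(self.process_batch(batch))
        return outputs


rows = {
    "text": ["a b", "c d", "e f"],
    "text_reconstructed": ["a b", "c x", "e f"],
}
pipeline = MetricAnalyzerSketch(columns=["text"], batch_size=2)
outputs = pipeline(rows)
```

Passing the metric in as a callable keeps the sketch independent of any particular metric; a wrapper around an `evaluate` metric could be supplied in its place.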

Work in Progress (WIP) parts:

  • Error handling could be expanded, especially in the process_batch method.
  • The results attribute in MetricAnalyzerPipeline is initialized but not used in the current implementation.
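One possible direction for both WIP items, shown as a hedged sketch: catch per-column failures so a single bad column does not abort the batch, and append each batch's scores to a results list. The function `score_columns`, the `_reconstructed` suffix, and the toy metric are hypothetical and do not come from the PR.

```python
import logging
from typing import Any, Callable, Dict, List, Optional

logger = logging.getLogger(__name__)


def toy_metric(predictions: List[str], references: List[str]) -> float:
    """Stand-in metric: fraction of exact matches."""
    if not references:
        return 0.0
    return sum(p == r for p, r in zip(predictions, references)) / len(references)


def score_columns(
    batch: Dict[str, List[Any]],
    columns: List[str],
    metric: Callable[[List[str], List[str]], float],
    results: List[Dict[str, Optional[float]]],
    suffix: str = "_reconstructed",  # assumed naming convention
) -> Dict[str, Optional[float]]:
    """Score each column; record None for columns that cannot be scored."""
    scores: Dict[str, Optional[float]] = {}
    for column in columns:
        try:
            original = batch[column]
            reconstructed = batch[column + suffix]
            if len(original) != len(reconstructed):
                raise ValueError("original and reconstructed lengths differ")
            scores[column] = metric(reconstructed, original)
        except (KeyError, ValueError) as exc:
            logger.warning("Skipping column %r: %s", column, exc)
            scores[column] = None
    # Accumulate per-batch scores so they can be inspected after the run.
    results.append(scores)
    return scores


results: List[Dict[str, Optional[float]]] = []
batch = {"text": ["a", "b"], "text_reconstructed": ["a", "b"], "title": ["t"]}
scores = score_columns(batch, ["text", "title"], toy_metric, results)
```

Whether a failed column should be skipped, as here, or should fail the whole run is a design choice the PR leaves open.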

Test plan

To test these changes, we will:

  1. Create unit tests for individual components of the MetricAnalyzerPipeline.
  2. Implement integration tests to ensure proper functioning within the SONAR pipeline ecosystem.
  3. Test with various metrics and dataset types to ensure flexibility and robustness.
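As an illustration of step 1, a unit test for the metric computation might look like the sketch below. It uses the standard `unittest` module and a hypothetical stand-in `compute_metric` function with an assumed length check; the real tests would target `MetricAnalyzerPipeline` and the project's own test runner.

```python
import unittest
from typing import List


def compute_metric(original: List[str], reconstructed: List[str]) -> float:
    """Hypothetical stand-in: fraction of reconstructed items matching the original."""
    if len(original) != len(reconstructed):
        raise ValueError("original and reconstructed must have the same length")
    if not original:
        return 0.0
    return sum(o == r for o, r in zip(original, reconstructed)) / len(original)


class ComputeMetricTest(unittest.TestCase):
    def test_identical_inputs_score_one(self) -> None:
        self.assertEqual(compute_metric(["a", "b"], ["a", "b"]), 1.0)

    def test_partial_match(self) -> None:
        self.assertAlmostEqual(compute_metric(["a", "b"], ["a", "x"]), 0.5)

    def test_length_mismatch_raises(self) -> None:
        with self.assertRaises(ValueError):
            compute_metric(["a"], ["a", "b"])


def run_tests() -> unittest.TestResult:
    """Run the test case programmatically and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ComputeMetricTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```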

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jul 19, 2024
@botirk38 botirk38 changed the title Implement metric pipeline to analyze model performance #5 Implement metric pipeline to analyze model performance Aug 2, 2024
@botirk38 botirk38 changed the title #5 Implement metric pipeline to analyze model performance #6 Implement metric pipeline to analyze model performance Aug 12, 2024
Contributor

@antoine-tran antoine-tran left a comment

This PR looks good to me. I think some of the code needs to be improved, but once the comments are addressed (and the linter is happy, of course), we can land this.

huggingface_pipelines/metric_analyzer.py (outdated)
huggingface_pipelines/metric_analyzer.py (outdated)
@botirk38 botirk38 merged commit 9004185 into facebookresearch:main Sep 3, 2024
4 of 5 checks passed
artemru added a commit that referenced this pull request Sep 6, 2024
artemru added a commit that referenced this pull request Sep 6, 2024
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
4 participants