docs(machine_learning_patterns.md): add enterprise ml patterns tutorial (#1696) #1704

Open: 2 commits to be merged into base: main

philip-ndikum

What type of PR is this? (check all applicable)

  • 💾 Refactor
  • ✨ Feature
  • 🐛 Bug Fix
  • 🔧 Optimization
  • 📝 Documentation
  • ✅ Test
  • 🐳 Other

Related issues

Checklist

  • Code follows style guide (ruff)
  • Tests added
  • Documented the changes

Description

This PR adds a comprehensive tutorial on enterprise-grade machine learning patterns in Narwhals. The tutorial consolidates backend-agnostic patterns that are essential for production ML workflows.

Patterns Covered

  1. Data Validation Patterns
    • Eager validation for immediate feedback
    • Lazy validation for optimized workflows
    • Development vs production modes with pass_through
  2. Time Series Operations
    • Group-level temporal validations
    • Efficient duplicate detection
    • Lazy evaluation for large datasets
  3. Feature Engineering
    • Backend-agnostic transformations
    • Memory-efficient computations
    • Type-safe operations with proper casting
  4. Environment Management
    • Lean production setup
    • Comprehensive testing environment
    • Backend-specific dependency management

Additional Context

  • A set of companion notebooks demonstrating these patterns has been added to TemporalScope
  • The tutorial is designed to be extensible, allowing for additional patterns to be added as the community identifies more common use cases
  • All code examples are tested across Pandas, Polars, and Dask backends

The tutorial aims to provide clear, actionable guidance for developers building robust ML pipelines with Narwhals, while maintaining backend independence and performance optimization.

philip-ndikum and others added 2 commits January 2, 2025 00:36
add tutorial covering:

data validation with eager/lazy evaluation
time series operations and validation
feature engineering with backend-agnostic transformations
environment management for production/testing
closes narwhals-dev#1696
@MarcoGorelli
Member

thanks a tonne for your pr! will take a look in the week, looks really good

Development

Successfully merging this pull request may close these issues.

[Doc]: Enhance Narwhals Tutorials with Backend-Agnostic Patterns
2 participants