Narwhals tutorials could be significantly enhanced by consolidating backend-agnostic patterns into a single, robust tutorial tailored for enterprise-grade machine learning (ML) and artificial intelligence (AI) workflows. This tutorial would focus on practical, production-ready patterns that simplify common tasks, ensure backend consistency, and align with scalable development workflows.
Key Focus Areas (each pattern is sketched in code below):

- Data Validation Patterns:
  - Eager validation for immediate feedback (e.g., numeric and categorical feature validation).
  - Lazy validation for optimized workflows across larger datasets.
- Time Series Operations:
  - Group-level metrics such as temporal aggregations (mean, null counts) for time-indexed datasets.
  - Temporal validation for uniqueness, consistency, and handling of mixed frequencies.
- Feature Engineering:
  - Backend-agnostic numeric and categorical transformations.
  - Patterns for missing-value imputation, standardization, and case consistency.
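For the data validation patterns, here is a minimal sketch of what the tutorial could show; the helper names, column arguments, and error messages are illustrative assumptions, not existing Narwhals or TemporalScope APIs:

```python
import narwhals as nw


@nw.narwhalify(eager_only=True)
def validate_features_eager(df, numeric_cols: list[str], categorical_cols: list[str]):
    """Eager validation: immediate feedback on pandas / Polars DataFrames."""
    required = [*numeric_cols, *categorical_cols]
    missing = [c for c in required if c not in df.columns]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    with_nulls = [c for c in numeric_cols if df[c].is_null().sum() > 0]
    if with_nulls:
        raise ValueError(f"numeric features contain nulls: {with_nulls}")
    return df


@nw.narwhalify
def null_report_lazy(df, numeric_cols: list[str]):
    """Lazy validation: builds a null-count report that Dask or Polars LazyFrame
    inputs only evaluate when the caller collects/computes the result."""
    return df.select(nw.col(*numeric_cols).is_null().sum())
```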
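Similarly, for the time series operations, a sketch of a group-level temporal summary; comparing the timestamp count with the unique count doubles as a duplicate-timestamp check, and the column names are assumptions:

```python
import narwhals as nw


@nw.narwhalify
def temporal_summary(df, entity_col: str, time_col: str, value_col: str):
    """Group-level temporal metrics, written once and run unchanged on
    pandas, Polars, or Dask."""
    return (
        df.group_by(entity_col)
        .agg(
            nw.col(value_col).mean().alias(f"{value_col}_mean"),
            nw.col(value_col).is_null().sum().alias(f"{value_col}_nulls"),
            # Duplicate timestamps within an entity show up as count > n_unique.
            nw.col(time_col).count().alias(f"{time_col}_count"),
            nw.col(time_col).n_unique().alias(f"{time_col}_unique"),
        )
        .sort(entity_col)
    )
```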
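And for feature engineering, a sketch combining imputation, standardization, and case consistency; filling a standardized column with 0.0 amounts to mean imputation in z-score space, and the column lists are again illustrative:

```python
import narwhals as nw


@nw.narwhalify
def engineer_features(df, numeric_cols: list[str], categorical_cols: list[str]):
    """Backend-agnostic transformations written once for pandas, Polars, and Dask."""
    return df.with_columns(
        # Z-score standardization; filling nulls with 0.0 afterwards is
        # equivalent to mean imputation on the original scale.
        *[
            ((nw.col(c) - nw.col(c).mean()) / nw.col(c).std()).fill_null(0.0).alias(c)
            for c in numeric_cols
        ],
        # Case consistency for categorical/string features.
        *[nw.col(c).str.to_lowercase().alias(c) for c in categorical_cols],
    )
```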
In our package TemporalScope, which leverages Narwhals for model-agnostic explainability in AI/ML workflows, these patterns would be immensely valuable for ensuring robust data preparation and validation. Specifically:
Use Case:

- Validating and transforming features across Pandas, Polars, and Dask backends for explainable ML workflows.
- Handling time series data in both single-step and multi-step forecasting pipelines.
Development Workflow:

- Lean Main Environment: a hatch environment limited to Narwhals, without heavy dependencies like Pandas or Dask.
- Comprehensive Test Environment: a hatch environment including all relevant libraries (Pandas, Polars, Dask) to validate runtime behavior and backend-agnostic patterns (see the test sketch below).
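For the comprehensive test environment, one way to exercise the backend matrix is a parametrized test. This is a sketch assuming pandas, Polars, Dask, and pytest are all installed in that environment; the toy data and function names are invented for illustration:

```python
# Sketch of a backend-matrix test for the comprehensive hatch environment.
import dask.dataframe as dd
import narwhals as nw
import pandas as pd
import polars as pl
import pytest


@nw.narwhalify
def mean_by_entity(df, entity_col: str, value_col: str):
    """Toy backend-agnostic transformation under test."""
    return df.group_by(entity_col).agg(nw.col(value_col).mean().alias("value_mean"))


DATA = {"entity": ["a", "a", "b"], "value": [1.0, 3.0, 5.0]}
CONSTRUCTORS = [
    pd.DataFrame,
    pl.DataFrame,
    lambda data: dd.from_pandas(pd.DataFrame(data), npartitions=1),
]


@pytest.mark.parametrize("constructor", CONSTRUCTORS)
def test_mean_by_entity_across_backends(constructor):
    result = mean_by_entity(constructor(DATA), "entity", "value")
    # Dask returns a lazy collection; materialize it before asserting.
    if hasattr(result, "compute"):
        result = result.compute()
    assert len(result) == 2
```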
By integrating these patterns into a single, enterprise-grade tutorial, Narwhals would provide developers with clear, actionable guidance for robust AI/ML workflows [CC @kanenorman].
Suggestion
Create a condensed tutorial notebook that demonstrates these patterns, building directly on the feedback shared:
- Universal Backend Support: showcase compatibility with Pandas, Polars, and Dask.
- Core Narwhals Patterns: focus on the use of `@nw.narwhalify`, lazy/eager evaluation strategies, and backend-agnostic transformations (see the sketch below).
- Production-Ready Use Cases: condense practical examples that are directly applicable to AI/ML pipelines, following @FBruzzesi's recommendations (e.g., using `pass_through=True` or `strict=False` where necessary).
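A short sketch of those core patterns; the function and column names are invented for the example, and `pass_through=True` corresponds to the older `strict=False` spelling:

```python
import narwhals as nw


@nw.narwhalify
def add_doubled_value(df):
    """Accepts a pandas, Polars, or Dask frame (eager or lazy) and returns the
    same native type; the expression itself is backend-agnostic."""
    return df.with_columns((nw.col("value") * 2).alias("value_doubled"))


def wrap_if_supported(obj):
    """With pass_through=True, objects Narwhals does not recognize are returned
    unchanged instead of raising, which keeps mixed pipelines flowing."""
    return nw.from_native(obj, pass_through=True)
```

Because `@nw.narwhalify` converts on the way in and back to the native type on the way out, the same function covers both the lazy and eager evaluation strategies mentioned above.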
add tutorial covering:

- data validation with eager/lazy evaluation
- time series operations and validation
- feature engineering with backend-agnostic transformations
- environment management for production/testing

closes narwhals-dev#1696