Skip to content

This package simplifies the workflow of the 'validate' package by providing tools to manage the usage of 'confront', 'validator', 'violating', and 'summary' functions. It also streamlines the integration with base R's 'stop' and 'warning' functions, enhancing error handling and reporting.

License

Notifications You must be signed in to change notification settings

AngelFelizR/tidyvalidate

Repository files navigation

tidyvalidate tidyvalidate website

Lifecycle: experimental Codecov test coverage R-CMD-check

Overview

tidyvalidate simplifies data validation in R by providing an intuitive interface to the powerful validate package. It helps ensure data quality by making it easy to:

  • Write clear, expressive validation rules
  • Check both column-level and row-level conditions
  • Get detailed reports of validation failures
  • Integrate validation checks into your data pipeline

The package streamlines common validation tasks while leveraging the robust foundation of the validate package and R’s error handling system.

Installation

You can install the development version of tidyvalidate from GitHub:

# Install pak if you haven't already
# install.packages("pak")

# Install tidyvalidate
pak::pak("AngelFelizR/tidyvalidate")

Quick Start

Basic Validation

Let’s validate some data from the built-in mtcars dataset:

library(tidyvalidate)

# Define and run validations
validation_results <- validate_rules(
  mtcars,
  # Column type validations
  mpg_is_numeric = is.numeric(mpg),
  hp_is_numeric = is.numeric(hp),
  
  # Business rule validation
  mpg_minimum = mpg > 15
)

# View results
validation_results
#> $summary
#>              name items passes fails   nNA  error warning
#>            <char> <int>  <int> <int> <int> <lgcl>  <lgcl>
#> 1: mpg_is_numeric     1      1     0     0  FALSE   FALSE
#> 2:  hp_is_numeric     1      1     0     0  FALSE   FALSE
#> 3:    mpg_minimum    32     26     6     0  FALSE   FALSE
#> 
#> $row_level_errors
#> $row_level_errors$mpg_minimum
#>    Broken Rule   mpg
#>         <char> <num>
#> 1: mpg_minimum  14.3
#> 2: mpg_minimum  10.4
#> 3: mpg_minimum  10.4
#> 4: mpg_minimum  14.7
#> 5: mpg_minimum  13.3
#> 6: mpg_minimum  15.0

The results show: - A summary of all validations - Detailed information about which rows failed the mpg_minimum check

Taking Action on Validation Failures

You can automatically handle validation failures in two ways:

1. Stop Execution on Failure

try({
  validate_rules(mtcars,
    mpg_minimum = mpg > 15
  ) |>
    action_if_problem(
      "Critical: Found cars with MPG below minimum threshold"
    )
})
#> [1] "Critical: Found cars with MPG below minimum threshold"
#>           name items passes fails   nNA  error warning
#>         <char> <int>  <int> <int> <int> <lgcl>  <lgcl>
#> 1: mpg_minimum    32     26     6     0  FALSE   FALSE
#> Error in action_if_problem(validate_rules(mtcars, mpg_minimum = mpg >  : 
#>   Critical: Found cars with MPG below minimum threshold

2. Continue with Warning

validation_results <- validate_rules(mtcars,
    mpg_minimum = mpg > 15
  ) |>
  action_if_problem(
    "Advisory: Some cars have low MPG values",
    problem_action = "warning"
  )
#> [1] "Advisory: Some cars have low MPG values"
#>           name items passes fails   nNA  error warning
#>         <char> <int>  <int> <int> <int> <lgcl>  <lgcl>
#> 1: mpg_minimum    32     26     6     0  FALSE   FALSE
#> Warning in action_if_problem(validate_rules(mtcars, mpg_minimum = mpg > :
#> Advisory: Some cars have low MPG values

Key Features

  • Simple Interface: Write validation rules using familiar R syntax
  • Comprehensive Results: Get both summary statistics and row-level details
  • Flexible Actions: Choose between warnings and errors based on severity
  • Pipeline Integration: Works seamlessly with the pipe operator
  • Detailed Reporting: Identify exactly which rows failed validation

Learn More

About

This package simplifies the workflow of the 'validate' package by providing tools to manage the usage of 'confront', 'validator', 'violating', and 'summary' functions. It also streamlines the integration with base R's 'stop' and 'warning' functions, enhancing error handling and reporting.

Resources

License

Code of conduct

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages