From 23994e730f5460d6a6e3dd642dce1605ae270153 Mon Sep 17 00:00:00 2001
From: Miha Zgubic
Date: Mon, 16 May 2022 16:24:57 +0100
Subject: [PATCH] mention fitting transforms in the docs

---
 docs/src/examples.md   |  2 ++
 docs/src/transforms.md | 15 +++++++++++++--
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/docs/src/examples.md b/docs/src/examples.md
index 0ca7392..d2c307c 100644
--- a/docs/src/examples.md
+++ b/docs/src/examples.md
@@ -7,6 +7,8 @@ First we load some hourly weather data:
 ```jldoctest example
 julia> using DataFrames, Dates, FeatureTransforms
 
+julia> using FeatureTransforms: fit!
+
 julia> df = DataFrame(
            :time => DateTime(2018, 9, 10):Hour(1):DateTime(2018, 9, 10, 23),
            :temperature => [10.6, 9.5, 8.9, 8.9, 8.4, 8.4, 7.7, 8.9, 11.7, 13.9, 16.2, 17.7, 18.9, 20.0, 21.2, 21.7, 21.7, 21.2, 20.0, 18.4, 16.7, 15.0, 13.9, 12.7],
diff --git a/docs/src/transforms.md b/docs/src/transforms.md
index 12a8b41..72d09a8 100644
--- a/docs/src/transforms.md
+++ b/docs/src/transforms.md
@@ -2,12 +2,14 @@
 
 A `Transform` defines a transformation of data for feature engineering purposes.
 Some examples are scaling, periodic functions, linear combination, and one-hot encoding.
+Transforms can be stateless, for example the power transform, or they can be stateful and fit to the data, such as [`StandardScaling`](@ref).
 
 ```@meta
 DocTestSetup = quote
     using DataFrames
     using Dates
     using FeatureTransforms
+    using FeatureTransforms: fit!
 end
 ```
 
@@ -20,6 +22,15 @@ For example, the following defines a squaring operation (i.e. raise to the power
 julia> p = Power(2);
 ```
 
+A stateful transform, such as a [`StandardScaling`](@ref), should also be fit to the data before it is applied:
+```julia-repl
+julia> s = StandardScaling();
+
+julia> x = rand(5);
+
+julia> FeatureTransforms.fit!(s, x);
+```
+
 ## Methods to apply a transform
 
 Given some data `x`, there are three main methods to apply a transform.
@@ -147,7 +158,7 @@ julia> M
 julia> normalize_row = StandardScaling();
 
 julia> fit!(normalize_row, M; dims=1, inds=[2])
-StandardScaling(3.0, 2.8284271247461903, true)
+StandardScaling(3.0, 2.8284271247461903)
 
 julia> normalize_row(M; dims=1, inds=[2])
 1×2 Matrix{Float64}:
@@ -156,7 +167,7 @@ julia> normalize_row(M; dims=1, inds=[2])
 julia> normalize_col = StandardScaling();
 
 julia> fit!(normalize_col, M; dims=2, inds=[2])
-StandardScaling(5.0, 1.0, true)
+StandardScaling(5.0, 1.0)
 
 julia> normalize_col(M; dims=2, inds=[2])
 3×1 Matrix{Float64}: