Formulas don't seem to be searched for globals #87

@DavisVaughan

First seen in futureverse/furrr#256.

I've reduced that down to this minimal-ish example. It seems that formula objects aren't searched for globals, though I'm not quite sure. If you do decide to look inside formula objects for globals, it will be important to look them up in the formula's own environment rather than in the standard envir argument.

The only way I found around this was to wrap up the formula creation in local() so the constant is contained in an environment that gets shipped along to the worker alongside the formula.

library(future)

set.seed(123)

plan(multisession, workers = 2)

df <- data.frame(
  y = sample(10),
  x = sample(10)
)

constant <- rep(0, nrow(df))
formula <- y ~ x + constant

# This works
model.matrix(formula, data = df)
#>    (Intercept)  x constant
#> 1            1 10        0
#> 2            1  5        0
#> 3            1  3        0
#> 4            1  8        0
#> 5            1  1        0
#> 6            1  4        0
#> 7            1  6        0
#> 8            1  9        0
#> 9            1  7        0
#> 10           1  2        0
#> attr(,"assign")
#> [1] 0 1 2

# This doesn't
result <- future::future({
  model.matrix(formula, data = df)
})

# Uh oh
value(result)
#> Error in eval(predvars, data, env): object 'constant' not found

# //////////////////////////////////////////////////////////////////////////////

# globals is not seeing `constant`
globals::globalsOf(quote({
  model.matrix(formula, data = df)
}))
#> $`{`
#> .Primitive("{")
#> 
#> $model.matrix
#> function (object, ...) 
#> UseMethod("model.matrix")
#> <bytecode: 0x7fb5667dcbb0>
#> <environment: namespace:stats>
#> 
#> $formula
#> y ~ x + constant
#> 
#> $df
#>     y  x
#> 1   3 10
#> 2  10  5
#> 3   2  3
#> 4   8  8
#> 5   6  1
#> 6   9  4
#> 7   1  6
#> 8   7  9
#> 9   5  7
#> 10  4  2
#> 
#> attr(,"where")
#> attr(,"where")$`{`
#> <environment: base>
#> 
#> attr(,"where")$model.matrix
#> <environment: package:stats>
#> attr(,"name")
#> [1] "package:stats"
#> attr(,"path")
#> [1] "/Library/Frameworks/R.framework/Versions/4.2/Resources/library/stats"
#> 
#> attr(,"where")$formula
#> <environment: R_GlobalEnv>
#> 
#> attr(,"where")$df
#> <environment: R_GlobalEnv>
#> 
#> attr(,"class")
#> [1] "Globals" "list"

# Note that the environment of `formula` is the global env.
# Since the global env isn't serialized to the worker, the `constant` won't
# be available over on the worker

# //////////////////////////////////////////////////////////////////////////////

# Here is a trick that does work, taking advantage of the fact that a `formula`
# keeps track of its environment

formula <- local({
  constant <- rep(0, nrow(df))
  y ~ x + constant
})

# Note the env here isn't the global env
formula
#> y ~ x + constant
#> <environment: 0x7fb5706b6a38>

# So now this works
result <- future::future({
  model.matrix(formula, data = df)
})

value(result)
#>    (Intercept)  x constant
#> 1            1 10        0
#> 2            1  5        0
#> 3            1  3        0
#> 4            1  8        0
#> 5            1  1        0
#> 6            1  4        0
#> 7            1  6        0
#> 8            1  9        0
#> 9            1  7        0
#> 10           1  2        0
#> attr(,"assign")
#> [1] 0 1 2
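
The `local()` trick above can be generalized. The following is only a sketch under assumptions: `localize_formula()` is a hypothetical helper (it is not part of future or globals) that uses base R's `all.vars()` to list the variables a formula references, looks up the ones that are not columns of `data` in the formula's own environment, and copies them into a small fresh environment that travels with the formula when it is serialized. The choice of `globalenv()` as the parent mirrors what `local()` does at top level and is also an assumption.

```r
# Hypothetical helper, not part of future or globals.
# Re-homes a formula into a small environment that holds only the
# non-data variables it references.
localize_formula <- function(f, data) {
  # Variables named in the formula that are not columns of `data`
  vars <- setdiff(all.vars(f), names(data))

  # Look them up where the formula was created, not in the caller's frame
  src <- environment(f)

  # Fresh environment; the global env parent keeps attached packages reachable
  env <- new.env(parent = globalenv())

  for (v in vars) {
    if (exists(v, envir = src, inherits = TRUE)) {
      assign(v, get(v, envir = src, inherits = TRUE), envir = env)
    }
  }

  environment(f) <- env
  f
}

# Usage with the objects from the example above:
# formula <- localize_formula(y ~ x + constant, df)
# result <- future::future({
#   model.matrix(formula, data = df)
# })
```

Note that this copies the values eagerly, so any large object referenced by the formula is serialized along with it.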
