Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PRCODE(.) and SET_PRVALUE() are not API #471

Closed
mmaechler opened this issue Oct 21, 2024 · 4 comments
Closed

PRCODE(.) and SET_PRVALUE() are not API #471

mmaechler opened this issue Oct 21, 2024 · 4 comments

Comments

@mmaechler
Copy link
Collaborator

The first is if (PRCODE(arg) == R_MissingArg) { .... } here

if (PRCODE(arg) == R_MissingArg) {

and basically identically (with a != instead of ==) also on line 156 .

As we know, we (R core) have "solidified" R's C API recently, and visibly in R-devel (to be released in Spring 2025),
and with (somewhat extended) tests, one now gets

* checking compiled code ... NOTE
File ‘S7/libs/S7.so’:
  Found non-API calls to R: ‘PRCODE’, ‘SET_PRVALUE’

Compiled code should not call non-API entry points in R.

See ‘Writing portable packages’ in the ‘Writing R Extensions’ manual,
and section ‘Moving into C API compliance’ for issues with the use of
non-API entry points.

The WRE section is https://cran.r-project.org/doc/manuals/r-devel/R-exts.html#Moving-into-C-API-compliance

@t-kalinowski
Copy link
Member

t-kalinowski commented Oct 21, 2024

Is there an API for working with promises?

I might be mistaken, but I don't think there are API entrypoints yet for working with promises. As far as I can tell, there is currently no approved way to:

  • Determine if a symbol is a promise in a frame.
  • Get a reference to a promise in a frame.
  • Safely test if a symbol is R_missingArg without calling R-level base::missing().
  • Access the promise expression or environment.

@t-kalinowski
Copy link
Member

t-kalinowski commented Oct 29, 2024

Following up on this, I've experimented with some approaches to avoid directly accessing promise internals while maintaining efficient method dispatch. In PR #483, I drafted a solution using Rf_eval() and an R-level call to base::missing() to avoid accessing promise internals. While this works in most cases, there's one significant regression: when a method calls substitute(some_arg), the original promise expression is unavailable for dispatch arguments. This regression makes the solution unsuitable for release.

However, another potential approach uses the relatively new Exec() function to perform dispatch while preserving dispatch argument promises for method introspection. Since Exec() is relatively new and wasn't specifically designed for this purpose, we should likely defer this implementation to a subsequent release. This would give us time to identify any unintended consequences. Additionally, we'll need to maintain the current implementation for older R versions where Exec() isn't available.

Here's a sketch of how method dispatch might work using Exec():

some_generic <- function(x, ...) {
  force(x) # would need to check if `x` is missing
  S7_dispatch()
}
attr(some_generic, "name") <- "some_generic" 

some_method <- function(x, ..., y = 1) {
  list(
    "sys.call()" = rlang::zap_srcref(sys.call()),
    "substitute(x)" = substitute(x),
    "substitute(y)" = substitute(y),
    "substitute(list(...))" = substitute(list(...)),
    x = x,
    y = y,
    "list(...)" = list(...)
  )
}

S7_dispatch <- function() {
  Exec(.prep_dispatched_call(sys.call(-1), sys.function(-1), sys.frame(-1)))
}

.find_method <- function(call_, generic_, env_) {
  # this would be in C
  some_method
}

.prep_dispatched_call <- function(call_, generic_, env_) {
  # this would be in C
  method <- .find_method(call_, generic_, env_)
  generic_name <- attr(some_generic, "name", TRUE)
  
  # define/mask the generic name in the S7_dispatch() call frame
  assign(generic_name, method, parent.frame()) 
  
  # ensure the call is to the generic name (it usualy all ready is, unless the
  # head of the generic call was itself a call or inlined closure)
  call_[[1]] <- as.name(generic_name) 
  call_
}


str(some_generic(3))
#> List of 7
#>  $ sys.call()           : language some_generic(3)
#>  $ substitute(x)        : num 3
#>  $ substitute(y)        : num 1
#>  $ substitute(list(...)): language list()
#>  $ x                    : num 3
#>  $ y                    : num 1
#>  $ list(...)            : list()

sym <- 3
str(some_generic(sym, 4))
#> List of 7
#>  $ sys.call()           : language some_generic(sym, 4)
#>  $ substitute(x)        : symbol sym
#>  $ substitute(y)        : num 1
#>  $ substitute(list(...)): language list(4)
#>  $ x                    : num 3
#>  $ y                    : num 1
#>  $ list(...)            :List of 1
#>   ..$ : num 4

str(some_generic(sym, (4)))
#> List of 7
#>  $ sys.call()           : language some_generic(sym, (4))
#>  $ substitute(x)        : symbol sym
#>  $ substitute(y)        : num 1
#>  $ substitute(list(...)): language list((4))
#>  $ x                    : num 3
#>  $ y                    : num 1
#>  $ list(...)            :List of 1
#>   ..$ : num 4

str(some_generic(sym, (4), y = 99))
#> List of 7
#>  $ sys.call()           : language some_generic(sym, (4), y = 99)
#>  $ substitute(x)        : symbol sym
#>  $ substitute(y)        : num 99
#>  $ substitute(list(...)): language list((4))
#>  $ x                    : num 3
#>  $ y                    : num 99
#>  $ list(...)            :List of 1
#>   ..$ : num 4

str((some_generic)(sym, (4), y = 99))
#> List of 7
#>  $ sys.call()           : language some_generic(sym, (4), y = 99)
#>  $ substitute(x)        : symbol sym
#>  $ substitute(y)        : num 99
#>  $ substitute(list(...)): language list((4))
#>  $ x                    : num 3
#>  $ y                    : num 99
#>  $ list(...)            :List of 1
#>   ..$ : num 4

str((some_generic)(sym, (4), y = sym))
#> List of 7
#>  $ sys.call()           : language some_generic(sym, (4), y = sym)
#>  $ substitute(x)        : symbol sym
#>  $ substitute(y)        : symbol sym
#>  $ substitute(list(...)): language list((4))
#>  $ x                    : num 3
#>  $ y                    : num 3
#>  $ list(...)            :List of 1
#>   ..$ : num 4

Created on 2024-10-29 with reprex v2.1.1

@t-kalinowski
Copy link
Member

t-kalinowski commented Oct 29, 2024

Continuing on this just a little further, one of the challenges that's most difficult to workaround with the current API is determining if a symbol is bound to R_MissingArg in a frame; currently, the only "API" way is to call out to R-level base::missing().

I ran benchmarks to measure the overhead of calling base::missing() from C via Rf_eval() versus the non-API pattern of Rf_findVarInFrame(sym, env) == R_MissingArg. The "API" approach is consistently 20-50% slower. While the benchmark calls themselves include some non-negligible overhead, this overhead should be consistent between the two approaches. In absolute terms, on my (somewhat dated) Linux box, the overhead of using the API is approximately 500 ns.

It would be ideal if R_missing were made available in the C API.

Benchmark setup
#define R_NO_REMAP
#include <R.h>
#include <Rinternals.h>
#include <R_ext/Rdynload.h>

Rboolean is_missing_api(SEXP name_sym, SEXP envir) {
  static SEXP missing_call = NULL;
  if (missing_call == NULL) {
    SEXP missing_fn = Rf_eval(Rf_install("missing"), R_BaseEnv);
    missing_call = Rf_lang2(missing_fn, R_NilValue);
    R_PreserveObject(missing_call);
  }

  SETCADR(missing_call, name_sym);
  
  SEXP result = PROTECT(Rf_eval(missing_call, envir));
  Rboolean is_miss = Rf_asLogical(result);
  UNPROTECT(1);
  
  return is_miss;
}

Rboolean is_missing_nonapi(SEXP name_sym, SEXP envir) {
  SEXP val = Rf_findVarInFrame(envir, name_sym);
  return (Rboolean) (val == R_MissingArg);
}

SEXP is_missing_api_(SEXP name_sym, SEXP envir) {
  return Rf_ScalarLogical(is_missing_api(name_sym, envir));
}

SEXP is_missing_nonapi_(SEXP name_sym, SEXP envir) {
  return Rf_ScalarLogical(is_missing_nonapi(name_sym, envir));
}

static const R_CallMethodDef callMethods[]  = {
  {"is_missing_api_"   , (DL_FUNC) &is_missing_api_   , 2},
  {"is_missing_nonapi_", (DL_FUNC) &is_missing_nonapi_, 2},
  {NULL, NULL, 0}
};

void R_init_benchmarks(DllInfo *dll) {
  R_registerRoutines(dll, NULL, callMethods, NULL, NULL);
  R_useDynamicSymbols(dll, FALSE);
}
## usethis namespace: start
#' @useDynLib benchmarks, .registration = TRUE
## usethis namespace: end
NULL

#' @export
is_missing_api <- function(sym, env) {
  .Call(is_missing_api_, sym, env)
}

#' @export
is_missing_nonapi <- function(sym, env) {
  .Call(is_missing_nonapi_, sym, env)
}
Benchmark
api <- function(x) {
  is_missing_api(quote(x), environment())
}

nonapi <- function(x) {
  is_missing_nonapi(quote(x), environment())
}

bench::mark(
  api(),
  nonapi(),
) |> print() |> plot() |> print()

Results

# A tibble: 2 × 13
  expression      min   median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time result    memory     time      
  <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm> <list>    <list>     <list>    
1 api()         942ns   1.49µs   407544.        0B        0 10000     0     24.5ms <lgl [1]> <Rprofmem> <bench_tm>
2 nonapi()      831ns 918.05ns   887416.        0B        0 10000     0     11.3ms <lgl [1]> <Rprofmem> <bench_tm>
# ℹ 1 more variable: gc <list>

image

@hadley
Copy link
Member

hadley commented Nov 7, 2024

Fixed by #483 (with no performance regression)

@hadley hadley closed this as completed Nov 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants