Can't create an entity set: AttributeError: 'EntitySet' object has no attribute 'entity_from_dataframe' #38

Open
jcalebsmith opened this issue Feb 26, 2022 · 2 comments

@jcalebsmith

When following the instructions in the README under the 'Creating an EntitySet' heading, the following code results in an error:

library(featuretoolsR)
library(magrittr)

set_1 <- data.frame(key = 1:100, value = sample(letters, 100, T), a = rep(Sys.Date(), 100))
set_2 <- data.frame(key = 1:100, value = sample(LETTERS, 100, T), b = rep(Sys.time(), 100))

es <- as_entityset(
  set_1, 
  index = "key", 
  entity_id = "set_1", 
  id = "demo", 
  time_index = "a"
)

The error states:

Error in py_get_attr_impl(x, name, silent) : 
  AttributeError: 'EntitySet' object has no attribute 'entity_from_dataframe'

Session Info:

> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19042)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] magrittr_2.0.2      dplyr_1.0.8         foreign_0.8-81      featuretoolsR_0.4.4

loaded via a namespace (and not attached):
 [1] reticulate_1.24      tidyselect_1.1.2     purrr_0.3.4          reshape2_1.4.4       listenv_0.8.0       
 [6] splines_4.1.2        lattice_0.20-45      colorspace_2.0-3     vctrs_0.3.8          generics_0.1.2      
[11] stats4_4.1.2         utf8_1.2.2           survival_3.2-13      prodlim_2019.11.13   rlang_1.0.1         
[16] ModelMetrics_1.2.2.2 pillar_1.7.0         glue_1.6.2           withr_2.4.3          rappdirs_0.3.3      
[21] foreach_1.5.2        lifecycle_1.0.1      plyr_1.8.6           lava_1.6.10          stringr_1.4.0       
[26] timeDate_3043.102    munsell_0.5.0        gtable_0.3.0         future_1.24.0        recipes_0.2.0       
[31] codetools_0.2-18     caret_6.0-90         parallel_4.1.2       class_7.3-19         fansi_1.0.2         
[36] Rcpp_1.0.8           scales_1.1.1         ipred_0.9-12         jsonlite_1.8.0       parallelly_1.30.0   
[41] png_0.1-7            ggplot2_3.3.5        digest_0.6.29        stringi_1.7.6        rprojroot_2.0.2     
[46] grid_4.1.2           here_1.0.1           hardhat_0.2.0        cli_3.2.0            tools_4.1.2         
[51] tibble_3.1.6         crayon_1.5.0         future.apply_1.8.1   pkgconfig_2.0.3      ellipsis_0.3.2      
[56] MASS_7.3-54          Matrix_1.3-4         data.table_1.14.2    pROC_1.18.0          lubridate_1.8.0     
[61] gower_1.0.0          rstudioapi_0.13      iterators_1.0.14     R6_2.5.1             globals_0.14.0      
[66] rpart_4.1-15         nnet_7.3-16          nlme_3.1-153         compiler_4.1.2 
@jcalebsmith
Author

Replacing the current version of add_entity.R with the following corrects the issue:

#' add_entity
#' @description Add an entity to an entityset.
#' @export
#'
#' @param entityset The entity set to modify.
#' @param entity_id The name of the entity to add.
#' @param df The data frame to add as an entity.
#' @param index The index parameter specifies the column that uniquely identifies rows in the dataframe
#' @param time_index Name of the time column in the dataframe.
#' @param ... Additional parameters passed to the EntitySet's `add_dataframe` method.
#' @return A modified entityset.
#'
#' @examples
#' \donttest{
#' library(magrittr)
#' create_entityset("set") %>%
#'   add_entity(df = cars,
#'              entity_id = "cars",
#'              index = "row_number")
#' }
add_entity <- function(
  entityset,
  entity_id,
  df,
  index = NULL,
  time_index = NULL,
  ...
) {
  # Construct logical_types to handle factors as categorical variables.
  classes <- purrr::map_dfr(sapply(df, FUN = function(col) {
    c <- class(col)
    # prettify difficult data types
    if (length(c) > 1)
      c <- paste0(c, collapse = ", ")
    return(c)
  }), c)

  logical_types <- list() # initialize
  if (any(classes == "factor")) {
    for (i in 1:length(classes)) {
      suppressWarnings({
        if (class(df[, i]) == "factor") {
          logical_types[[names(df)[i]]] <- .Categorical
        }
      })
    }
  }

  logical_types <- reticulate::r_to_py(logical_types)

  # Add df as entity to entityset.
  es <- entityset$add_dataframe(
    dataframe_name = entity_id,
    dataframe = reticulate::r_to_py(x = df),
    index = index,
    time_index = time_index,
    logical_types = logical_types,
    ...
  )

  return(es)

}
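
As far as I can tell, the root cause is the featuretools 1.0 API change: `entity_from_dataframe` was replaced by `add_dataframe` (and `entity_id` by `dataframe_name`), which is what the patched call above relies on. If both old and new featuretools installs need to keep working, a rough sketch of picking the method name at runtime could look like this. `pick_add_method` and its `has_attr` argument are hypothetical names for illustration, not part of featuretoolsR:

```r
# Hypothetical helper, not part of featuretoolsR.
# `has_attr` is a predicate that reports whether the EntitySet exposes a
# method with the given name. Passing a predicate keeps the sketch runnable
# without a Python session.
pick_add_method <- function(has_attr) {
  if (has_attr("add_dataframe")) {
    "add_dataframe"            # assumed: featuretools >= 1.0
  } else if (has_attr("entity_from_dataframe")) {
    "entity_from_dataframe"    # assumed: featuretools < 1.0
  } else {
    stop("EntitySet exposes neither add_dataframe nor entity_from_dataframe")
  }
}

# Stand-ins for the two API generations, used only to exercise the helper.
new_api <- c("add_dataframe", "add_relationship")
old_api <- c("entity_from_dataframe", "add_relationship")

pick_add_method(function(n) n %in% new_api)
pick_add_method(function(n) n %in% old_api)
```

With a real EntitySet `es` obtained through reticulate, the predicate could be `function(n) reticulate::py_has_attr(es, n)`. Note that the argument names differ between the two methods as well, so the call itself would also need to branch.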

@praktiskt
Owner

Hi there, great that you found a solution. I'm not really working with R currently, but if you'd like to submit this as a PR (and update tests if required), I can create a new release for it (and send to CRAN).
