Problem correcting data using random forests #32

Open
jalazawa opened this issue Jan 5, 2018 · 0 comments

jalazawa commented Jan 5, 2018

I am trying to follow the example described in the package:

> load_data <- augment_validation(load_db)
> aggregated_db <- aggregate_with_rules(load_data)
> aggregated_db_valid <- augment_validation(aggregated_db)
> aggregated_db_coorect <- data_correct_with_rules(aggregated_db_valid)
> aggregated_db_augment_correct <- augment_process_summary(aggregated_db_coorect)
> dat <- as_learning_db(aggregated_db_augment_correct )
> x_vars <- c(
+   "year.iso", "week.iso", "hour.iso",
+   "day.iso", "light_time", "is_off", "likely_off",
+   "DAILY_MIN_CTY_MINUS_1", "DAILY_AVG_CTY_MINUS_1", "DAILY_MAX_CTY_MINUS_1",
+   "HOUR_SHIFT_CTY_MINUS_1")
> dat <- define_model_rf( data = dat, x_vars = x_vars, y_var = "CTY",
+                         save_model_dir = file.path( getwd(), "ttt"),
+                         id = "BACKWARD" )

H2O is not running yet, starting it now...

Note:  In case of errors look at the following log files:
    D:\Users\jalazawa\AppData\Local\Temp\RtmpmAvqvl/h2o_jalazawa_started_from_r.out
    D:\Users\jalazawa\AppData\Local\Temp\RtmpmAvqvl/h2o_jalazawa_started_from_r.err

java version "1.8.0_111"
Java(TM) SE Runtime Environment (build 1.8.0_111-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.111-b14, mixed mode)

Starting H2O JVM and connecting: . Connection successful!

R is connected to the H2O cluster: 
    H2O cluster uptime:         3 seconds 23 milliseconds 
    H2O cluster version:        3.16.0.2 
    H2O cluster version age:    1 month and 5 days  
    H2O cluster name:           H2O_started_from_R_jalazawa_orv612 
    H2O cluster total nodes:    1 
    H2O cluster total memory:   3.47 GB 
    H2O cluster total cores:    8 
    H2O cluster allowed cores:  8 
    H2O cluster healthy:        TRUE 
    H2O Connection ip:          localhost 
    H2O Connection port:        54321 
    H2O Connection proxy:       NA 
    H2O Internal Security:      FALSE 
    H2O API Extensions:         Algos, AutoML, Core V3, Core V4 
    R Version:                  R version 3.4.2 (2017-09-28) 

  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
Warning messages:
1: In .h2o.startModelJob(algo, params, h2oRestApiVersion) :
  Dropping bad and constant columns: [is_off, likely_off].

2: In .h2o.startModelJob(algo, params, h2oRestApiVersion) :
  Dropping bad and constant columns: [is_off, likely_off].

> x_vars <- c(
+   "year.iso", "week.iso", "hour.iso", "day.iso", "light_time",
+   "is_off", "likely_off", "DAILY_MIN_CTY_PLUS_1",
+   "DAILY_AVG_CTY_PLUS_1", "DAILY_MAX_CTY_PLUS_1", "HOUR_SHIFT_CTY_PLUS_1")
> dat <- define_model_rf( data = dat, x_vars = x_vars, y_var = "CTY",
+                         save_model_dir = file.path( getwd(), "ttt"),
+                         id = "FORWARD" )

H2O is not running yet, starting it now...

Note:  In case of errors look at the following log files:
    D:\Users\jalazawa\AppData\Local\Temp\RtmpmAvqvl/h2o_jalazawa_started_from_r.out
    D:\Users\jalazawa\AppData\Local\Temp\RtmpmAvqvl/h2o_jalazawa_started_from_r.err

java version "1.8.0_111"
Java(TM) SE Runtime Environment (build 1.8.0_111-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.111-b14, mixed mode)

Starting H2O JVM and connecting: . Connection successful!

R is connected to the H2O cluster: 
    H2O cluster uptime:         3 seconds 64 milliseconds 
    H2O cluster version:        3.16.0.2 
    H2O cluster version age:    1 month and 5 days  
    H2O cluster name:           H2O_started_from_R_jalazawa_ghv498 
    H2O cluster total nodes:    1 
    H2O cluster total memory:   3.47 GB 
    H2O cluster total cores:    8 
    H2O cluster allowed cores:  8 
    H2O cluster healthy:        TRUE 
    H2O Connection ip:          localhost 
    H2O Connection port:        54321 
    H2O Connection proxy:       NA 
    H2O Internal Security:      FALSE 
    H2O API Extensions:         Algos, AutoML, Core V3, Core V4 
    R Version:                  R version 3.4.2 (2017-09-28) 

  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
Warning messages:
1: In .h2o.startModelJob(algo, params, h2oRestApiVersion) :
  Dropping bad and constant columns: [is_off, likely_off].

2: In .h2o.startModelJob(algo, params, h2oRestApiVersion) :
  Dropping bad and constant columns: [is_off, likely_off].

> for(i in 1:2 ){
+   dat <- impute_with_model(dat, id = "FORWARD")
+   dat <- impute_with_model(dat, id = "BACKWARD")
+   dat <- update_learning_db(dat)
+ }

H2O is not running yet, starting it now...

Note:  In case of errors look at the following log files:
    D:\Users\jalazawa\AppData\Local\Temp\RtmpmAvqvl/h2o_jalazawa_started_from_r.out
    D:\Users\jalazawa\AppData\Local\Temp\RtmpmAvqvl/h2o_jalazawa_started_from_r.err

java version "1.8.0_111"
Java(TM) SE Runtime Environment (build 1.8.0_111-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.111-b14, mixed mode)

Starting H2O JVM and connecting: . Connection successful!

R is connected to the H2O cluster: 
    H2O cluster uptime:         3 seconds 116 milliseconds 
    H2O cluster version:        3.16.0.2 
    H2O cluster version age:    1 month and 5 days  
    H2O cluster name:           H2O_started_from_R_jalazawa_kox324 
    H2O cluster total nodes:    1 
    H2O cluster total memory:   3.47 GB 
    H2O cluster total cores:    8 
    H2O cluster allowed cores:  8 
    H2O cluster healthy:        TRUE 
    H2O Connection ip:          localhost 
    H2O Connection port:        54321 
    H2O Connection proxy:       NA 
    H2O Internal Security:      FALSE 
    H2O API Extensions:         Algos, AutoML, Core V3, Core V4 
    R Version:                  R version 3.4.2 (2017-09-28) 

  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
 Connection successful!

R is connected to the H2O cluster: 
    H2O cluster uptime:         11 seconds 430 milliseconds 
    H2O cluster version:        3.16.0.2 
    H2O cluster version age:    1 month and 5 days  
    H2O cluster name:           H2O_started_from_R_jalazawa_kox324 
    H2O cluster total nodes:    1 
    H2O cluster total memory:   2.98 GB 
    H2O cluster total cores:    8 
    H2O cluster allowed cores:  8 
    H2O cluster healthy:        TRUE 
    H2O Connection ip:          localhost 
    H2O Connection port:        54321 
    H2O Connection proxy:       NA 
    H2O Internal Security:      FALSE 
    H2O API Extensions:         Algos, AutoML, Core V3, Core V4 
    R Version:                  R version 3.4.2 (2017-09-28) 

  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
Error in .h2o.doSafeREST(h2oRestApiVersion = h2oRestApiVersion, urlSuffix = page,  : 
  Unexpected CURL error: Empty reply from server

It looks to me as if an impossible request is being made. Could a value be missing from one of the models?

Or is it that the models cannot be applied to certain countries, such as NorthIRELAND, because too much of the data is missing or zero?

If I use only the "forward" models, it seems to work:

> for(i in 1:2 ){
+   dat <- impute_with_model(dat, id = "FORWARD")
+   dat <- update_learning_db(dat)
+ }

H2O is not running yet, starting it now...

Note:  In case of errors look at the following log files:
    D:\Users\jalazawa\AppData\Local\Temp\RtmpmAvqvl/h2o_jalazawa_started_from_r.out
    D:\Users\jalazawa\AppData\Local\Temp\RtmpmAvqvl/h2o_jalazawa_started_from_r.err

java version "1.8.0_111"
Java(TM) SE Runtime Environment (build 1.8.0_111-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.111-b14, mixed mode)

Starting H2O JVM and connecting: . Connection successful!

R is connected to the H2O cluster: 
    H2O cluster uptime:         3 seconds 157 milliseconds 
    H2O cluster version:        3.16.0.2 
    H2O cluster version age:    1 month and 5 days  
    H2O cluster name:           H2O_started_from_R_jalazawa_gwy137 
    H2O cluster total nodes:    1 
    H2O cluster total memory:   3.47 GB 
    H2O cluster total cores:    8 
    H2O cluster allowed cores:  8 
    H2O cluster healthy:        TRUE 
    H2O Connection ip:          localhost 
    H2O Connection port:        54321 
    H2O Connection proxy:       NA 
    H2O Internal Security:      FALSE 
    H2O API Extensions:         Algos, AutoML, Core V3, Core V4 
    R Version:                  R version 3.4.2 (2017-09-28) 

  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%

H2O is not running yet, starting it now...

Note:  In case of errors look at the following log files:
    D:\Users\jalazawa\AppData\Local\Temp\RtmpmAvqvl/h2o_jalazawa_started_from_r.out
    D:\Users\jalazawa\AppData\Local\Temp\RtmpmAvqvl/h2o_jalazawa_started_from_r.err

java version "1.8.0_111"
Java(TM) SE Runtime Environment (build 1.8.0_111-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.111-b14, mixed mode)

Starting H2O JVM and connecting: . Connection successful!

R is connected to the H2O cluster: 
    H2O cluster uptime:         3 seconds 97 milliseconds 
    H2O cluster version:        3.16.0.2 
    H2O cluster version age:    1 month and 5 days  
    H2O cluster name:           H2O_started_from_R_jalazawa_nfy087 
    H2O cluster total nodes:    1 
    H2O cluster total memory:   3.47 GB 
    H2O cluster total cores:    8 
    H2O cluster allowed cores:  8 
    H2O cluster healthy:        TRUE 
    H2O Connection ip:          localhost 
    H2O Connection port:        54321 
    H2O Connection proxy:       NA 
    H2O Internal Security:      FALSE 
    H2O API Extensions:         Algos, AutoML, Core V3, Core V4 
    R Version:                  R version 3.4.2 (2017-09-28) 

  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%

If I use only the backward models, everything works fine:

> for(i in 1:2 ){
+   dat <- impute_with_model(dat, id = "BACKWARD")
+   dat <- update_learning_db(dat)
+ }

H2O is not running yet, starting it now...

Note:  In case of errors look at the following log files:
    D:\Users\jalazawa\AppData\Local\Temp\RtmpmAvqvl/h2o_jalazawa_started_from_r.out
    D:\Users\jalazawa\AppData\Local\Temp\RtmpmAvqvl/h2o_jalazawa_started_from_r.err

java version "1.8.0_111"
Java(TM) SE Runtime Environment (build 1.8.0_111-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.111-b14, mixed mode)

Starting H2O JVM and connecting: . Connection successful!

R is connected to the H2O cluster: 
    H2O cluster uptime:         3 seconds 237 milliseconds 
    H2O cluster version:        3.16.0.2 
    H2O cluster version age:    1 month and 5 days  
    H2O cluster name:           H2O_started_from_R_jalazawa_apt214 
    H2O cluster total nodes:    1 
    H2O cluster total memory:   3.47 GB 
    H2O cluster total cores:    8 
    H2O cluster allowed cores:  8 
    H2O cluster healthy:        TRUE 
    H2O Connection ip:          localhost 
    H2O Connection port:        54321 
    H2O Connection proxy:       NA 
    H2O Internal Security:      FALSE 
    H2O API Extensions:         Algos, AutoML, Core V3, Core V4 
    R Version:                  R version 3.4.2 (2017-09-28) 

  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%

H2O is not running yet, starting it now...

Note:  In case of errors look at the following log files:
    D:\Users\jalazawa\AppData\Local\Temp\RtmpmAvqvl/h2o_jalazawa_started_from_r.out
    D:\Users\jalazawa\AppData\Local\Temp\RtmpmAvqvl/h2o_jalazawa_started_from_r.err

java version "1.8.0_111"
Java(TM) SE Runtime Environment (build 1.8.0_111-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.111-b14, mixed mode)

Starting H2O JVM and connecting: . Connection successful!

R is connected to the H2O cluster: 
    H2O cluster uptime:         3 seconds 153 milliseconds 
    H2O cluster version:        3.16.0.2 
    H2O cluster version age:    1 month and 5 days  
    H2O cluster name:           H2O_started_from_R_jalazawa_pjg433 
    H2O cluster total nodes:    1 
    H2O cluster total memory:   3.47 GB 
    H2O cluster total cores:    8 
    H2O cluster allowed cores:  8 
    H2O cluster healthy:        TRUE 
    H2O Connection ip:          localhost 
    H2O Connection port:        54321 
    H2O Connection proxy:       NA 
    H2O Internal Security:      FALSE 
    H2O API Extensions:         Algos, AutoML, Core V3, Core V4 
    R Version:                  R version 3.4.2 (2017-09-28) 

  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%

Now, if I use both models together again, I run into the initial problem:

> for(i in 1:2 ){
+   dat <- impute_with_model(dat, id = "FORWARD")
+   dat <- impute_with_model(dat, id = "BACKWARD")
+   dat <- update_learning_db(dat)
+ }

H2O is not running yet, starting it now...

Note:  In case of errors look at the following log files:
    D:\Users\jalazawa\AppData\Local\Temp\RtmpmAvqvl/h2o_jalazawa_started_from_r.out
    D:\Users\jalazawa\AppData\Local\Temp\RtmpmAvqvl/h2o_jalazawa_started_from_r.err

java version "1.8.0_111"
Java(TM) SE Runtime Environment (build 1.8.0_111-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.111-b14, mixed mode)

Starting H2O JVM and connecting: . Connection successful!

R is connected to the H2O cluster: 
    H2O cluster uptime:         3 seconds 141 milliseconds 
    H2O cluster version:        3.16.0.2 
    H2O cluster version age:    1 month and 5 days  
    H2O cluster name:           H2O_started_from_R_jalazawa_nsj808 
    H2O cluster total nodes:    1 
    H2O cluster total memory:   3.47 GB 
    H2O cluster total cores:    8 
    H2O cluster allowed cores:  8 
    H2O cluster healthy:        TRUE 
    H2O Connection ip:          localhost 
    H2O Connection port:        54321 
    H2O Connection proxy:       NA 
    H2O Internal Security:      FALSE 
    H2O API Extensions:         Algos, AutoML, Core V3, Core V4 
    R Version:                  R version 3.4.2 (2017-09-28) 

  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
 Connection successful!

R is connected to the H2O cluster: 
    H2O cluster uptime:         11 seconds 773 milliseconds 
    H2O cluster version:        3.16.0.2 
    H2O cluster version age:    1 month and 5 days  
    H2O cluster name:           H2O_started_from_R_jalazawa_nsj808 
    H2O cluster total nodes:    1 
    H2O cluster total memory:   3.14 GB 
    H2O cluster total cores:    8 
    H2O cluster allowed cores:  8 
    H2O cluster healthy:        TRUE 
    H2O Connection ip:          localhost 
    H2O Connection port:        54321 
    H2O Connection proxy:       NA 
    H2O Internal Security:      FALSE 
    H2O API Extensions:         Algos, AutoML, Core V3, Core V4 
    R Version:                  R version 3.4.2 (2017-09-28) 

  |================================================================| 100%
  |================================================================| 100%
  |================================================================| 100%
  |                                                                |   0%
Error in .h2o.doSafeREST(h2oRestApiVersion = h2oRestApiVersion, urlSuffix = urlSuffix,  : 
  Unexpected CURL error: Failed to connect to localhost port 54321: Connection refused