diff --git a/docs/_freeze/posts/ibisml/index/execute-results/html.json b/docs/_freeze/posts/ibisml/index/execute-results/html.json new file mode 100644 index 000000000000..950118eafda8 --- /dev/null +++ b/docs/_freeze/posts/ibisml/index/execute-results/html.json @@ -0,0 +1,19 @@ +{ + "hash": "7961865bb3c84357060e112e10cd8afb", + "result": { + "engine": "jupyter", + "markdown": "---\ntitle: \"Using IbisML and DuckDB for a Kaggle competition: credit risk model stability\"\nauthor: \"Jiting Xu\"\ndate: \"2024-08-22\"\ncategories:\n - blog\n - duckdb\n - machine learning\n - feature engineering\n---\n\n## Introduction\nIn this post, we'll demonstrate how to use Ibis and [IbisML](https://github.com/ibis-project/ibis-ml)\nend-to-end for the\n[credit risk model stability Kaggle competition](https://www.kaggle.com/competitions/home-credit-credit-risk-model-stability).\n\n1. Load data and perform feature engineering on the DuckDB backend using Ibis and IbisML\n2. Perform last-mile ML data preprocessing on the DuckDB backend using IbisML\n3. Train two models using different frameworks:\n * An XGBoost model within a scikit-learn pipeline.\n * A neural network with PyTorch and PyTorch Lightning.\n\nThe aim of this competition is to predict which clients are more likely to default on their\nloans, using both internal and external data sources.\n\nTo get started with Ibis and IbisML, please refer to their websites:\n\n* [Ibis](https://ibis-project.org/): An open-source dataframe library that works with any data system.\n* [IbisML](https://ibis-project.github.io/ibis-ml/): A library for building scalable ML pipelines.\n\n\n## Prerequisites\nTo run this example, you'll need to download the data from the Kaggle website (which requires a Kaggle account) and install Ibis, IbisML, and the necessary modeling libraries.\n\n### Download data\nYou need a Kaggle account to download the data. If you do not have one,\nfeel free to register for one.\n\n1. 
Option 1: Manual download\n * Log into your Kaggle account and download all data from this\n [link](https://www.kaggle.com/competitions/home-credit-credit-risk-model-stability/data),\n unzip the files, and save them to your local disk.\n2. Option 2: Kaggle API\n * Go to your `Kaggle Account Settings`.\n * Under the `API` section, click on `Create New API Token`. This will download the `kaggle.json`\n file to your computer.\n * Place the `kaggle.json` file in the expected directory, normally `~/.kaggle` under your\n home directory:\n\n ```bash\n mkdir -p ~/.kaggle\n mv ~/Downloads/kaggle.json ~/.kaggle\n ```\n * Install the Kaggle CLI and download the data:\n\n ```bash\n pip install kaggle\n kaggle competitions download -c home-credit-credit-risk-model-stability\n unzip home-credit-credit-risk-model-stability.zip\n ```\n\n### Install libraries\nTo use Ibis and IbisML with the DuckDB backend for building models, you'll need to install the\nnecessary packages. Depending on your preferred machine learning framework, choose\none of the following installation commands:\n\nFor PyTorch-based models:\n\n```{.bash}\npip install 'ibis-framework[duckdb]' ibis-ml torch pytorch-lightning\n```\n\nFor XGBoost and scikit-learn-based models:\n\n```{.bash}\npip install 'ibis-framework[duckdb]' ibis-ml 'xgboost[scikit-learn]'\n```\n\nImport libraries:\n\n::: {#dba3e08c .cell execution_count=2}\n``` {.python .cell-code}\nimport ibis\nimport ibis.expr.datatypes as dt\nfrom ibis import _\nimport ibis_ml as ml\nfrom pathlib import Path\nfrom glob import glob\n\n# enable interactive mode for ibis\nibis.options.interactive = True\n```\n:::\n\n\nSet the backend for computing:\n\n::: {#d4b26ac9 .cell execution_count=3}\n``` {.python .cell-code}\ncon = ibis.duckdb.connect()\n# remove the black bars from duckdb's progress bar\ncon.raw_sql(\"set enable_progress_bar = false\")\n# DuckDB is already the default backend for Ibis; we set it explicitly here\nibis.set_backend(con)\n```\n:::\n\n\nSet the data paths:\n\n::: {#9930c4ad 
.cell execution_count=4}\n``` {.python .cell-code}\n# change the root path to yours\nROOT = Path(\"/Users/claypot/Downloads/home-credit-credit-risk-model-stability\")\nTRAIN_DIR = ROOT / \"parquet_files\" / \"train\"\nTEST_DIR = ROOT / \"parquet_files\" / \"test\"\n```\n:::\n\n\n## Data loading and processing\nWe'll use Ibis to read the Parquet files and perform the necessary processing for the next step.\n\n### Directory structure and tables\nSince there are many data files, let's start by examining the directory structure and\ntables within the train directory:\n\n```bash\n# change this to your directory\ntree -L 2 ~/Downloads/home-credit-credit-risk-model-stability/parquet_files/train\n```\n\n:::{.callout-note title=\"Click to show data files\" collapse=\"true\"}\n\n```bash\n~/Downloads/home-credit-credit-risk-model-stability/parquet_files/train\n├── train_applprev_1_0.parquet\n├── train_applprev_1_1.parquet\n├── train_applprev_2.parquet\n├── train_base.parquet\n├── train_credit_bureau_a_1_0.parquet\n├── train_credit_bureau_a_1_1.parquet\n├── train_credit_bureau_a_1_3.parquet\n├── train_credit_bureau_a_2_0.parquet\n├── train_credit_bureau_a_2_1.parquet\n├── train_credit_bureau_a_2_10.parquet\n├── train_credit_bureau_a_2_2.parquet\n├── train_credit_bureau_a_2_3.parquet\n├── train_credit_bureau_a_2_4.parquet\n├── train_credit_bureau_a_2_5.parquet\n├── train_credit_bureau_a_2_6.parquet\n├── train_credit_bureau_a_2_7.parquet\n├── train_credit_bureau_a_2_8.parquet\n├── train_credit_bureau_a_2_9.parquet\n├── train_credit_bureau_b_1.parquet\n├── train_credit_bureau_b_2.parquet\n├── train_debitcard_1.parquet\n├── train_deposit_1.parquet\n├── train_other_1.parquet\n├── train_person_1.parquet\n├── train_person_2.parquet\n├── train_static_0_0.parquet\n├── train_static_0_1.parquet\n├── train_static_cb_0.parquet\n├── train_tax_registry_a_1.parquet\n├── train_tax_registry_b_1.parquet\n└── train_tax_registry_c_1.parquet\n```\n\n:::\n\nThe `train_base.parquet` file is the base 
table, while the others are feature tables.\nLet's take a quick look at these tables.\n\n#### Base table\nThe base table (`train_base.parquet`) contains the unique ID, a binary target flag,\nand other information for the training samples. The unique ID serves as the\nlinking key for joining with the feature tables.\n\n* `case_id` - This is the unique ID for each loan. You'll need this ID to\n join feature tables to the base table. There are about 1.5m unique loans.\n* `date_decision` - This refers to the date when a decision was made regarding the\n approval of the loan.\n* `WEEK_NUM` - This is the week number used for aggregation. In the test sample,\n `WEEK_NUM` continues sequentially from the last training value of `WEEK_NUM`.\n* `MONTH` - This column represents the month when the approval decision was made.\n* `target` - This is the binary target flag, determined after a certain period based on\n whether or not the client defaulted on the specific loan.\n\nHere are a few examples from the base table:\n\n::: {#54d40a5b .cell execution_count=5}\n``` {.python .cell-code code-fold=\"true\" code-summary=\"Show code to get the top 5 rows of base table\"}\nibis.read_parquet(TRAIN_DIR / \"train_base.parquet\").head(5)\n```\n\n::: {.cell-output .cell-output-display execution_count=36}\n```{=html}\n
┏━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━┓\n┃ case_id ┃ date_decision ┃ MONTH ┃ WEEK_NUM ┃ target ┃\n┡━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━┩\n│ int64 │ string │ int64 │ int64 │ int64 │\n├─────────┼───────────────┼────────┼──────────┼────────┤\n│ 0 │ 2019-01-03 │ 201901 │ 0 │ 0 │\n│ 1 │ 2019-01-03 │ 201901 │ 0 │ 0 │\n│ 2 │ 2019-01-04 │ 201901 │ 0 │ 0 │\n│ 3 │ 2019-01-03 │ 201901 │ 0 │ 0 │\n│ 4 │ 2019-01-04 │ 201901 │ 0 │ 1 │\n└─────────┴───────────────┴────────┴──────────┴────────┘\n\n```\n:::\n:::\n\n\n#### Feature tables\nThe remaining files contain approximately 370 features from\nprevious loan applications and external data sources. Their definitions can be found in the feature\ndefinition [file](https://www.kaggle.com/competitions/home-credit-credit-risk-model-stability/data)\nfrom the competition website.\n\nThere are a few things worth noting about the feature tables:\n\n* **Union datasets**: A single dataset may be saved across multiple Parquet files, such as\n`train_applprev_1_0.parquet` and `train_applprev_1_1.parquet`. We need to union these files.\n* **Dataset levels**: Datasets may have different levels, which we will explain as\nfollows:\n * **Depth = 0**: Each row in the table is identified by a unique `case_id`.\n In this case, you can directly join the features with the base table and use them as\n features for further analysis or processing.\n * **Depth > 0**: You will group the data based on the `case_id` and perform calculations\n or aggregations within each group.\n\nHere are two examples of tables with different levels.\n\nExample of a table with depth = 0: `case_id` is the row identifier, so the features can be joined\n directly with the base table.\n\n::: {#fdaf873c .cell execution_count=6}\n``` {.python .cell-code code-fold=\"true\" code-summary=\"Show code to get the top 5 rows of user static data\"}\nibis.read_parquet(TRAIN_DIR / \"train_static_cb_0.parquet\").head(5)\n```\n\n::: {.cell-output 
.cell-output-display execution_count=37}\n```{=html}\n
┏━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┓\n┃ case_id ┃ assignmentdate_238D ┃ assignmentdate_4527235D ┃ assignmentdate_4955616D ┃ birthdate_574D ┃ contractssum_5085716L ┃ dateofbirth_337D ┃ dateofbirth_342D ┃ days120_123L ┃ days180_256L ┃ days30_165L ┃ days360_512L ┃ days90_310L ┃ description_5085714M ┃ education_1103M ┃ education_88M ┃ firstquarter_103L ┃ for3years_128L ┃ for3years_504L ┃ for3years_584L ┃ formonth_118L ┃ formonth_206L ┃ formonth_535L ┃ forquarter_1017L ┃ forquarter_462L ┃ forquarter_634L ┃ fortoday_1092L ┃ forweek_1077L ┃ forweek_528L ┃ forweek_601L ┃ foryear_618L ┃ foryear_818L ┃ foryear_850L ┃ fourthquarter_440L ┃ maritalst_385M ┃ maritalst_893M ┃ numberofqueries_373L ┃ pmtaverage_3A ┃ pmtaverage_4527227A ┃ pmtaverage_4955615A ┃ pmtcount_4527229L ┃ pmtcount_4955617L ┃ pmtcount_693L ┃ pmtscount_423L ┃ pmtssum_45A ┃ requesttype_4525192L ┃ responsedate_1012D ┃ responsedate_4527233D ┃ responsedate_4917613D ┃ riskassesment_302T ┃ riskassesment_940T ┃ secondquarter_766L ┃ thirdquarter_1082L 
┃\n┡━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━┩\n│ int64 │ string │ string │ string │ string │ float64 │ string │ string │ float64 │ float64 │ float64 │ float64 │ float64 │ string │ string │ string │ float64 │ float64 │ float64 │ float64 │ float64 │ float64 │ float64 │ float64 │ float64 │ float64 │ float64 │ float64 │ float64 │ float64 │ float64 │ float64 │ float64 │ float64 │ string │ string │ float64 │ float64 │ float64 │ float64 │ float64 │ float64 │ float64 │ float64 │ float64 │ string │ string │ string │ string │ string │ float64 │ float64 │ float64 
│\n├─────────┼─────────────────────┼─────────────────────────┼─────────────────────────┼────────────────┼───────────────────────┼──────────────────┼──────────────────┼──────────────┼──────────────┼─────────────┼──────────────┼─────────────┼──────────────────────┼─────────────────┼───────────────┼───────────────────┼────────────────┼────────────────┼────────────────┼───────────────┼───────────────┼───────────────┼──────────────────┼─────────────────┼─────────────────┼────────────────┼───────────────┼──────────────┼──────────────┼──────────────┼──────────────┼──────────────┼────────────────────┼────────────────┼────────────────┼──────────────────────┼───────────────┼─────────────────────┼─────────────────────┼───────────────────┼───────────────────┼───────────────┼────────────────┼─────────────┼──────────────────────┼────────────────────┼───────────────────────┼───────────────────────┼────────────────────┼────────────────────┼────────────────────┼────────────────────┤\n│ 357 │ NULL │ NULL │ NULL │ 1988-04-01 │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ a55475b1 │ a55475b1 │ a55475b1 │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ a55475b1 │ a55475b1 │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ 6.0 │ 6301.4000 │ NULL │ 2019-01-25 │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │\n│ 381 │ NULL │ NULL │ NULL │ 1973-11-01 │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ a55475b1 │ a55475b1 │ a55475b1 │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ a55475b1 │ a55475b1 │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ 6.0 │ 4019.6000 │ NULL │ 2019-01-25 │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │\n│ 388 │ NULL │ NULL │ NULL │ 1989-04-01 │ NULL │ 1989-04-01 │ NULL │ 6.0 │ 8.0 │ 2.0 │ 10.0 │ 4.0 │ a55475b1 │ a55475b1 │ a55475b1 │ 2.0 │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL 
│ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ 6.0 │ a55475b1 │ a55475b1 │ 10.0 │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ 6.0 │ 14548.0000 │ NULL │ 2019-01-28 │ NULL │ NULL │ NULL │ NULL │ 3.0 │ 5.0 │\n│ 405 │ NULL │ NULL │ NULL │ 1974-03-01 │ NULL │ 1974-03-01 │ NULL │ 0.0 │ 0.0 │ 0.0 │ 1.0 │ 0.0 │ a55475b1 │ a55475b1 │ a55475b1 │ 0.0 │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ 4.0 │ a55475b1 │ a55475b1 │ 1.0 │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ 6.0 │ 10498.2400 │ NULL │ 2019-01-21 │ NULL │ NULL │ NULL │ NULL │ 2.0 │ 0.0 │\n│ 409 │ NULL │ NULL │ NULL │ 1993-06-01 │ NULL │ 1993-06-01 │ NULL │ 2.0 │ 3.0 │ 0.0 │ 3.0 │ 1.0 │ a55475b1 │ 717ddd49 │ a55475b1 │ 4.0 │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ 1.0 │ a7fcb6e5 │ a55475b1 │ 3.0 │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ 7.0 │ 6344.8804 │ NULL │ 2019-01-21 │ NULL │ NULL │ NULL │ NULL │ 0.0 │ 4.0 
│\n└─────────┴─────────────────────┴─────────────────────────┴─────────────────────────┴────────────────┴───────────────────────┴──────────────────┴──────────────────┴──────────────┴──────────────┴─────────────┴──────────────┴─────────────┴──────────────────────┴─────────────────┴───────────────┴───────────────────┴────────────────┴────────────────┴────────────────┴───────────────┴───────────────┴───────────────┴──────────────────┴─────────────────┴─────────────────┴────────────────┴───────────────┴──────────────┴──────────────┴──────────────┴──────────────┴──────────────┴────────────────────┴────────────────┴────────────────┴──────────────────────┴───────────────┴─────────────────────┴─────────────────────┴───────────────────┴───────────────────┴───────────────┴────────────────┴─────────────┴──────────────────────┴────────────────────┴───────────────────────┴───────────────────────┴────────────────────┴────────────────────┴────────────────────┴────────────────────┘\n\n```\n:::\n:::\n\n\nExample of a table with depth = 1, we need to aggregate the features and collect statistics\nbased on `case_id` then join with the base table.\n\n::: {#4765269b .cell execution_count=7}\n``` {.python .cell-code code-fold=\"true\" code-summary=\"Show code to get the top 5 rows of credit bureau data\"}\nibis.read_parquet(TRAIN_DIR / \"train_credit_bureau_b_1.parquet\").relocate(\n \"num_group1\"\n).order_by([\"case_id\", \"num_group1\"]).head(5)\n```\n\n::: {.cell-output .cell-output-display execution_count=38}\n```{=html}\n
┏━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┓\n┃ num_group1 ┃ case_id ┃ amount_1115A ┃ classificationofcontr_1114M ┃ contractdate_551D ┃ contractmaturitydate_151D ┃ contractst_516M ┃ contracttype_653M ┃ credlmt_1052A ┃ credlmt_228A ┃ credlmt_3940954A ┃ credor_3940957M ┃ credquantity_1099L ┃ credquantity_984L ┃ debtpastduevalue_732A ┃ debtvalue_227A ┃ dpd_550P ┃ dpd_733P ┃ dpdmax_851P ┃ dpdmaxdatemonth_804T ┃ dpdmaxdateyear_742T ┃ installmentamount_644A ┃ installmentamount_833A ┃ instlamount_892A ┃ interesteffectiverate_369L ┃ interestrateyearly_538L ┃ lastupdate_260D ┃ maxdebtpduevalodued_3940955A ┃ numberofinstls_810L ┃ overdueamountmax_950A ┃ overdueamountmaxdatemonth_494T ┃ overdueamountmaxdateyear_432T ┃ periodicityofpmts_997L ┃ periodicityofpmts_997M ┃ pmtdaysoverdue_1135P ┃ pmtmethod_731M ┃ pmtnumpending_403L ┃ purposeofcred_722M ┃ residualamount_1093A ┃ residualamount_127A ┃ residualamount_3940956A ┃ subjectrole_326M ┃ subjectrole_43M ┃ totalamount_503A ┃ totalamount_881A 
┃\n┡━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━┩\n│ int64 │ int64 │ float64 │ string │ string │ string │ string │ string │ float64 │ float64 │ float64 │ string │ float64 │ float64 │ float64 │ float64 │ float64 │ float64 │ float64 │ float64 │ float64 │ float64 │ float64 │ float64 │ float64 │ float64 │ string │ float64 │ float64 │ float64 │ float64 │ float64 │ string │ string │ float64 │ string │ float64 │ string │ float64 │ float64 │ float64 │ string │ string │ float64 │ float64 
│\n├────────────┼─────────┼──────────────┼─────────────────────────────┼───────────────────┼───────────────────────────┼─────────────────┼───────────────────┼───────────────┼──────────────┼──────────────────┼─────────────────┼────────────────────┼───────────────────┼───────────────────────┼────────────────┼──────────┼──────────┼─────────────┼──────────────────────┼─────────────────────┼────────────────────────┼────────────────────────┼──────────────────┼────────────────────────────┼─────────────────────────┼─────────────────┼──────────────────────────────┼─────────────────────┼───────────────────────┼────────────────────────────────┼───────────────────────────────┼────────────────────────┼────────────────────────┼──────────────────────┼────────────────┼────────────────────┼────────────────────┼──────────────────────┼─────────────────────┼─────────────────────────┼──────────────────┼─────────────────┼──────────────────┼──────────────────┤\n│ 0 │ 467 │ NULL │ ea6782cc │ 2011-06-15 │ 2031-06-13 │ 7241344e │ 724be82a │ 3.000000e+06 │ 10000.0 │ 3.000000e+06 │ P164_34_168 │ 2.0 │ 1.0 │ NULL │ NULL │ 0.0 │ 0.0 │ NULL │ NULL │ NULL │ 0.0 │ 0.000 │ NULL │ NULL │ NULL │ 2019-01-20 │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ a55475b1 │ NULL │ a55475b1 │ NULL │ 96a8fdfe │ 0.0 │ 0.0 │ NULL │ fa4f56f1 │ ab3c25cf │ 3.000000e+06 │ 10000.0 │\n│ 1 │ 467 │ NULL │ ea6782cc │ 2019-01-04 │ 2021-08-04 │ 7241344e │ 724be82a │ NULL │ NULL │ 1.303650e+05 │ P164_34_168 │ 1.0 │ 2.0 │ NULL │ NULL │ 0.0 │ 0.0 │ NULL │ NULL │ NULL │ 0.0 │ 26571.969 │ NULL │ NULL │ NULL │ 2019-01-20 │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ a55475b1 │ NULL │ a55475b1 │ NULL │ 96a8fdfe │ NULL │ NULL │ NULL │ ab3c25cf │ ab3c25cf │ 7.800000e+04 │ 960000.0 │\n│ 2 │ 467 │ 78000.0 │ ea6782cc │ 2016-10-25 │ 2019-10-25 │ 7241344e │ 4257cbed │ NULL │ NULL │ NULL │ c5a72b57 │ NULL │ NULL │ 0.0 │ 26571.969 │ NULL │ NULL │ 0.0 │ 11.0 │ 2016.0 │ NULL │ NULL │ 2898.76 │ NULL │ NULL │ 2019-01-10 │ 0.0 │ 36.0 │ 0.0 │ 11.0 │ 
2016.0 │ NULL │ a0b598e4 │ 0.0 │ e914c86c │ 10.0 │ 96a8fdfe │ NULL │ NULL │ NULL │ a55475b1 │ a55475b1 │ NULL │ NULL │\n│ 0 │ 1445 │ NULL │ ea6782cc │ 2015-01-30 │ 2021-01-30 │ 7241344e │ 1c9c5356 │ 4.000000e+05 │ 100000.0 │ 7.400000e+04 │ b619fa46 │ 2.0 │ 5.0 │ 0.0 │ NULL │ 0.0 │ 0.0 │ 200418.0 │ 1.0 │ 2018.0 │ 0.0 │ 0.000 │ NULL │ NULL │ NULL │ 2019-01-19 │ 0.4 │ NULL │ 1.4 │ 2.0 │ 2018.0 │ NULL │ a55475b1 │ 0.0 │ a55475b1 │ NULL │ 60c73645 │ 0.0 │ 0.0 │ 73044.18 │ daf49a8a │ ab3c25cf │ 4.000000e+05 │ 100000.0 │\n│ 1 │ 1445 │ NULL │ 01f63ac8 │ 2014-09-12 │ 2021-09-12 │ 7241344e │ 724be82a │ NULL │ NULL │ 4.000000e+05 │ 74bd67a8 │ 3.0 │ 17.0 │ NULL │ NULL │ 0.0 │ 0.0 │ NULL │ NULL │ NULL │ 0.0 │ 209617.770 │ NULL │ NULL │ NULL │ 2019-01-13 │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ a55475b1 │ NULL │ a55475b1 │ NULL │ 96a8fdfe │ NULL │ NULL │ NULL │ ab3c25cf │ ab3c25cf │ 3.968006e+05 │ 184587.8 │\n└────────────┴─────────┴──────────────┴─────────────────────────────┴───────────────────┴───────────────────────────┴─────────────────┴───────────────────┴───────────────┴──────────────┴──────────────────┴─────────────────┴────────────────────┴───────────────────┴───────────────────────┴────────────────┴──────────┴──────────┴─────────────┴──────────────────────┴─────────────────────┴────────────────────────┴────────────────────────┴──────────────────┴────────────────────────────┴─────────────────────────┴─────────────────┴──────────────────────────────┴─────────────────────┴───────────────────────┴────────────────────────────────┴───────────────────────────────┴────────────────────────┴────────────────────────┴──────────────────────┴────────────────┴────────────────────┴────────────────────┴──────────────────────┴─────────────────────┴─────────────────────────┴──────────────────┴─────────────────┴──────────────────┴──────────────────┘\n\n```\n:::\n:::\n\n\nFor more details on features and its exploratory data analysis (EDA), you can refer to\nfeature definition and these 
Kaggle notebooks:\n\n* [Feature\n definition](https://www.kaggle.com/competitions/home-credit-credit-risk-model-stability/data#:~:text=calendar_view_week-,feature_definitions,-.csv)\n* [Home credit risk prediction\n EDA](https://www.kaggle.com/code/loki97/home-credit-risk-prediction-eda)\n* [Home credit CRMS 2024\n EDA](https://www.kaggle.com/code/sergiosaharovskiy/home-credit-crms-2024-eda-and-submission)\n\n### Processing steps\nWe will perform the following data processing steps using Ibis and IbisML:\n\n* **Convert data types**: Ensure consistency by converting data types, as the same column\n in different sub-files may have different types.\n* **Aggregate features**: For tables with depth greater than 0, aggregate features\n by `case_id`, computing statistics such as the mean, median, mode, minimum,\n and standard deviation.\n* **Union and join datasets**: Combine multiple sub-files of the same dataset into one\n table, as some datasets are split into multiple sub-files with a common prefix. Afterward,\n join these tables with the base table.\n\n#### Convert data types\nWe'll use IbisML to create a chain of `Cast` steps, forming a recipe for data type\nconversion across the dataset. The conversion rules are based on information\nencoded in the column names. 
Columns that share a transformation are indicated by a\ncapital letter at the end of their names:\n\n* P - Transform DPD (days past due)\n* M - Masking categories\n* A - Transform amount\n* D - Transform date\n* T - Unspecified transform\n* L - Unspecified transform\n\nFor example, we'll define an IbisML transformation step to convert columns ending with `P`\nto floating-point numbers:\n\n::: {#92f70762 .cell execution_count=8}\n``` {.python .cell-code}\n# convert columns ending with P to floating-point numbers\nstep_cast_P_to_float = ml.Cast(ml.endswith(\"P\"), dt.float64)\n```\n:::\n\n\nNext, let's define additional type conversion transformations based on the suffix of the column names:\n\n::: {#3e3be631 .cell execution_count=9}\n``` {.python .cell-code code-fold=\"true\" code-summary=\"Show code to define more steps\"}\n# convert columns ending with A to floating-point numbers\nstep_cast_A_to_float = ml.Cast(ml.endswith(\"A\"), dt.float64)\n# convert columns ending with D to dates\nstep_cast_D_to_date = ml.Cast(ml.endswith(\"D\"), dt.date)\n# convert columns ending with M to strings\nstep_cast_M_to_str = ml.Cast(ml.endswith(\"M\"), dt.str)\n```\n:::\n\n\nWe'll construct an\n[IbisML Recipe](https://ibis-project.github.io/ibis-ml/reference/core.html#ibis_ml.Recipe),\nwhich chains together all the transformation steps.\n\n::: {#a275dbfd .cell execution_count=10}\n``` {.python .cell-code code-fold=\"true\" code-summary=\"Show code to construct the recipe\"}\ndata_type_recipes = ml.Recipe(\n step_cast_P_to_float,\n step_cast_D_to_date,\n step_cast_M_to_str,\n step_cast_A_to_float,\n # cast some special columns\n ml.Cast([\"date_decision\"], \"date\"),\n ml.Cast([\"case_id\", \"WEEK_NUM\", \"num_group1\", \"num_group2\"], dt.int64),\n ml.Cast(\n [\n \"cardtype_51L\",\n \"credacc_status_367L\",\n \"requesttype_4525192L\",\n \"riskassesment_302T\",\n \"max_periodicityofpmts_997L\",\n ],\n dt.str,\n ),\n ml.Cast(\n [\n \"isbidproductrequest_292L\",\n \"isdebitcard_527L\",\n \"equalityempfrom_62L\",\n 
],\n dt.int64,\n ),\n)\nprint(f\"Data format conversion recipe:\\n{data_type_recipes}\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nData format conversion recipe:\nRecipe(Cast(endswith('P'), 'float64'),\n Cast(endswith('D'), 'date'),\n Cast(endswith('M'), 'string'),\n Cast(endswith('A'), 'float64'),\n Cast(cols(('date_decision',)), 'date'),\n Cast(cols(('case_id', 'WEEK_NUM', 'num_group1', 'num_group2')), 'int64'),\n Cast(cols(('cardtype_51L', 'credacc_status_367L', 'requesttype_4525192L', 'riskassesment_302T', 'max_periodicityofpmts_997L')),\n 'string'),\n Cast(cols(('isbidproductrequest_292L', 'isdebitcard_527L', 'equalityempfrom_62L')),\n 'int64'))\n```\n:::\n:::\n\n\n::: {.callout-tip}\nIbisML offers a powerful set of column selectors, allowing you to select columns based\non names, types, and patterns. For more information, you can refer to the IbisML column\nselectors [documentation](https://ibis-project.github.io/ibis-ml/reference/selectors.html).\n:::\n\n#### Aggregate features\nFor tables with a depth greater than 0, which can't be directly joined with the base table,\nwe need to aggregate the features by `case_id`. You can compute different statistics for\nnumeric and non-numeric columns.\n\nHere, we use the maximum as an example.\n\n::: {#4f274968 .cell execution_count=11}\n``` {.python .cell-code code-fold=\"true\" code-summary=\"Show code to aggregate features by case_id using max\"}\ndef agg_by_id(table):\n return table.group_by(\"case_id\").agg(\n [\n table[col_name].max().name(f\"max_{col_name}\")\n for col_name in table.columns\n if col_name[-1] in (\"T\", \"L\", \"P\", \"A\", \"D\", \"M\")\n ]\n )\n```\n:::\n\n\n::: {.callout-tip}\nFor better predictive power, you should collect different statistics based on the meaning of each feature. 
For simplicity,\nwe'll only collect the maximum value of each feature here.\n:::\n\n#### Put them together\nWe'll put this together in a function that reads Parquet files, optionally handles glob patterns for\nmultiple sub-files, applies the data type transformations defined by `data_type_recipes`, and\nperforms aggregation based on `case_id` if specified by the `depth` parameter.\n\n::: {#4116845d .cell execution_count=12}\n``` {.python .cell-code code-fold=\"true\" code-summary=\"Show code to read and process data files\"}\ndef read_and_process_files(file_path, depth=None, is_regex=False):\n \"\"\"\n Read and process Parquet files.\n\n Args:\n file_path (str): Path to a file, or a glob pattern matching multiple files.\n depth (int, optional): Depth of processing. If 1 or 2, additional aggregation is performed.\n is_regex (bool, optional): Whether file_path is a glob pattern.\n\n Returns:\n ibis.Table: The processed Ibis table.\n \"\"\"\n if is_regex:\n # read and union multiple files\n chunks = []\n for path in glob(str(file_path)):\n chunk = ibis.read_parquet(path)\n # transform table using IbisML Recipe\n chunk = data_type_recipes.fit(chunk).to_ibis(chunk)\n chunks.append(chunk)\n table = ibis.union(*chunks)\n else:\n # read a single file\n table = ibis.read_parquet(file_path)\n # transform table using IbisML\n table = data_type_recipes.fit(table).to_ibis(table)\n\n # perform aggregation if depth is 1 or 2\n if depth in [1, 2]:\n table = agg_by_id(table)\n\n return table\n```\n:::\n\n\nLet's define two dictionaries, `train_data_store` and `test_data_store`, to organize and\nstore the processed training and testing datasets.\n\n::: {#513c202e .cell execution_count=13}\n``` {.python .cell-code code-fold=\"true\" code-summary=\"Show code to load all data into a dict\"}\ntrain_data_store = {\n \"df_base\": read_and_process_files(TRAIN_DIR / \"train_base.parquet\"),\n \"depth_0\": [\n read_and_process_files(TRAIN_DIR / \"train_static_cb_0.parquet\"),\n 
read_and_process_files(TRAIN_DIR / \"train_static_0_*.parquet\", is_regex=True),\n ],\n \"depth_1\": [\n read_and_process_files(\n TRAIN_DIR / \"train_applprev_1_*.parquet\", 1, is_regex=True\n ),\n read_and_process_files(TRAIN_DIR / \"train_tax_registry_a_1.parquet\", 1),\n read_and_process_files(TRAIN_DIR / \"train_tax_registry_b_1.parquet\", 1),\n read_and_process_files(TRAIN_DIR / \"train_tax_registry_c_1.parquet\", 1),\n read_and_process_files(TRAIN_DIR / \"train_credit_bureau_b_1.parquet\", 1),\n read_and_process_files(TRAIN_DIR / \"train_other_1.parquet\", 1),\n read_and_process_files(TRAIN_DIR / \"train_person_1.parquet\", 1),\n read_and_process_files(TRAIN_DIR / \"train_deposit_1.parquet\", 1),\n read_and_process_files(TRAIN_DIR / \"train_debitcard_1.parquet\", 1),\n ],\n \"depth_2\": [\n read_and_process_files(TRAIN_DIR / \"train_credit_bureau_b_2.parquet\", 2),\n ],\n}\n# we won't be submitting the predictions, so let's comment out the test data.\n# test_data_store = {\n# \"df_base\": read_and_process_files(TEST_DIR / \"test_base.parquet\"),\n# \"depth_0\": [\n# read_and_process_files(TEST_DIR / \"test_static_cb_0.parquet\"),\n# read_and_process_files(TEST_DIR / \"test_static_0_*.parquet\", is_regex=True),\n# ],\n# \"depth_1\": [\n# read_and_process_files(TEST_DIR / \"test_applprev_1_*.parquet\", 1, is_regex=True),\n# read_and_process_files(TEST_DIR / \"test_tax_registry_a_1.parquet\", 1),\n# read_and_process_files(TEST_DIR / \"test_tax_registry_b_1.parquet\", 1),\n# read_and_process_files(TEST_DIR / \"test_tax_registry_c_1.parquet\", 1),\n# read_and_process_files(TEST_DIR / \"test_credit_bureau_b_1.parquet\", 1),\n# read_and_process_files(TEST_DIR / \"test_other_1.parquet\", 1),\n# read_and_process_files(TEST_DIR / \"test_person_1.parquet\", 1),\n# read_and_process_files(TEST_DIR / \"test_deposit_1.parquet\", 1),\n# read_and_process_files(TEST_DIR / \"test_debitcard_1.parquet\", 1),\n# ],\n# \"depth_2\": [\n# read_and_process_files(TEST_DIR / 
\"test_credit_bureau_b_2.parquet\", 2),\n# ]\n# }\n```\n:::\n\n\nJoin all features data to base table:\n\n::: {#93211f82 .cell execution_count=14}\n``` {.python .cell-code code-fold=\"true\" code-summary=\"Define function to join feature tables to base table\"}\ndef join_data(df_base, depth_0, depth_1, depth_2):\n for i, df in enumerate(depth_0 + depth_1 + depth_2):\n df_base = df_base.join(\n df, \"case_id\", how=\"left\", rname=\"{name}_right\" + f\"_{i}\"\n )\n return df_base\n```\n:::\n\n\nGenerate train and test datasets:\n\n::: {#3b26eaf7 .cell execution_count=15}\n``` {.python .cell-code code-fold=\"true\" code-summary=\"Show code to generate train and test datasets\"}\ndf_train = join_data(**train_data_store)\n# df_test = join_data(**test_data_store)\ntotal_rows = df_train.count().execute()\nprint(f\"There is {total_rows} rows and {len(df_train.columns)} columns\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nThere is 1526659 rows and 377 columns\n```\n:::\n:::\n\n\n### Select features\nGiven the large number of features (~370), we'll focus on selecting just a few of the most\ninformative ones by name for demonstration purposes in this post:\n\n::: {#36e0ec8e .cell execution_count=16}\n``` {.python .cell-code code-fold=\"true\" code-summary=\"Show code to select important features for the train dataset\"}\ndf_train = df_train.select(\n \"case_id\",\n \"date_decision\",\n \"target\",\n # number of credit bureau queries for the last X days.\n \"days30_165L\",\n \"days360_512L\",\n \"days90_310L\",\n # number of tax deduction payments\n \"pmtscount_423L\",\n # sum of tax deductions for the client\n \"pmtssum_45A\",\n \"dateofbirth_337D\",\n \"education_1103M\",\n \"firstquarter_103L\",\n \"secondquarter_766L\",\n \"thirdquarter_1082L\",\n \"fourthquarter_440L\",\n \"maritalst_893M\",\n \"numberofqueries_373L\",\n \"requesttype_4525192L\",\n \"responsedate_4527233D\",\n \"actualdpdtolerance_344P\",\n \"amtinstpaidbefduel24m_4187115A\",\n 
\"annuity_780A\",\n \"annuitynextmonth_57A\",\n \"applicationcnt_361L\",\n \"applications30d_658L\",\n \"applicationscnt_1086L\",\n # average days past or before due of payment during the last 24 months.\n \"avgdbddpdlast24m_3658932P\",\n # average days past or before due of payment during the last 3 months.\n \"avgdbddpdlast3m_4187120P\",\n # end date of active contract.\n \"max_contractmaturitydate_151D\",\n # credit limit of an active loan.\n \"max_credlmt_1052A\",\n # number of credits in credit bureau\n \"max_credquantity_1099L\",\n \"max_dpdmaxdatemonth_804T\",\n \"max_dpdmaxdateyear_742T\",\n \"max_maxdebtpduevalodued_3940955A\",\n \"max_overdueamountmax_950A\",\n \"max_purposeofcred_722M\",\n \"max_residualamount_3940956A\",\n \"max_totalamount_503A\",\n \"max_cancelreason_3545846M\",\n \"max_childnum_21L\",\n \"max_currdebt_94A\",\n \"max_employedfrom_700D\",\n # client's main income amount in their previous application\n \"max_mainoccupationinc_437A\",\n \"max_profession_152M\",\n \"max_rejectreason_755M\",\n \"max_status_219L\",\n # credit amount of the active contract provided by the credit bureau\n \"max_amount_1115A\",\n # amount of unpaid debt for existing contracts\n \"max_debtpastduevalue_732A\",\n \"max_debtvalue_227A\",\n \"max_installmentamount_833A\",\n \"max_instlamount_892A\",\n \"max_numberofinstls_810L\",\n \"max_pmtnumpending_403L\",\n \"max_last180dayaveragebalance_704A\",\n \"max_last30dayturnover_651A\",\n \"max_openingdate_857D\",\n \"max_amount_416A\",\n \"max_amtdebitincoming_4809443A\",\n \"max_amtdebitoutgoing_4809440A\",\n \"max_amtdepositbalance_4809441A\",\n \"max_amtdepositincoming_4809444A\",\n \"max_amtdepositoutgoing_4809442A\",\n \"max_empl_industry_691L\",\n \"max_gender_992L\",\n \"max_housingtype_772L\",\n \"max_mainoccupationinc_384A\",\n \"max_incometype_1044T\",\n)\n\ndf_train.head()\n```\n\n::: {.cell-output .cell-output-display execution_count=47}\n```{=html}\n
┏━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━┓\n┃ case_id ┃ date_decision ┃ target ┃ days30_165L ┃ days360_512L ┃ days90_310L ┃ pmtscount_423L ┃ pmtssum_45A ┃ dateofbirth_337D ┃ education_1103M ┃ firstquarter_103L ┃ secondquarter_766L ┃ thirdquarter_1082L ┃ fourthquarter_440L ┃ maritalst_893M ┃ numberofqueries_373L ┃ requesttype_4525192L ┃ responsedate_4527233D ┃ actualdpdtolerance_344P ┃ amtinstpaidbefduel24m_4187115A ┃ annuity_780A ┃ annuitynextmonth_57A ┃ applicationcnt_361L ┃ 
applications30d_658L ┃ applicationscnt_1086L ┃ avgdbddpdlast24m_3658932P ┃ avgdbddpdlast3m_4187120P ┃ max_contractmaturitydate_151D ┃ max_credlmt_1052A ┃ max_credquantity_1099L ┃ max_dpdmaxdatemonth_804T ┃ max_dpdmaxdateyear_742T ┃ max_maxdebtpduevalodued_3940955A ┃ max_overdueamountmax_950A ┃ max_purposeofcred_722M ┃ max_residualamount_3940956A ┃ max_totalamount_503A ┃ max_cancelreason_3545846M ┃ max_childnum_21L ┃ max_currdebt_94A ┃ max_employedfrom_700D ┃ max_mainoccupationinc_437A ┃ max_profession_152M ┃ max_rejectreason_755M ┃ max_status_219L ┃ max_amount_1115A ┃ max_debtpastduevalue_732A ┃ max_debtvalue_227A ┃ max_installmentamount_833A ┃ max_instlamount_892A ┃ max_numberofinstls_810L ┃ max_pmtnumpending_403L ┃ max_last180dayaveragebalance_704A ┃ max_last30dayturnover_651A ┃ max_openingdate_857D ┃ max_amount_416A ┃ max_amtdebitincoming_4809443A ┃ max_amtdebitoutgoing_4809440A ┃ max_amtdepositbalance_4809441A ┃ max_amtdepositincoming_4809444A ┃ max_amtdepositoutgoing_4809442A ┃ max_empl_industry_691L ┃ max_gender_992L ┃ max_housingtype_772L ┃ max_mainoccupationinc_384A ┃ max_incometype_1044T 
┃\n┡━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━┩\n│ int64 │ date │ int64 │ float64 │ float64 │ float64 │ float64 │ float64 │ date │ string │ float64 │ float64 │ float64 │ float64 │ string │ float64 │ string │ date │ float64 │ float64 │ float64 │ float64 │ float64 │ float64 │ float64 │ float64 │ float64 │ date │ float64 │ float64 │ float64 │ float64 │ float64 │ float64 │ string │ float64 │ float64 │ string │ float64 │ float64 │ date │ float64 │ string │ string │ string │ float64 │ 
float64 │ float64 │ float64 │ float64 │ float64 │ float64 │ float64 │ float64 │ date │ float64 │ float64 │ float64 │ float64 │ float64 │ float64 │ string │ string │ string │ float64 │ string │\n├─────────┼───────────────┼────────┼─────────────┼──────────────┼─────────────┼────────────────┼─────────────┼──────────────────┼─────────────────┼───────────────────┼────────────────────┼────────────────────┼────────────────────┼────────────────┼──────────────────────┼──────────────────────┼───────────────────────┼─────────────────────────┼────────────────────────────────┼──────────────┼──────────────────────┼─────────────────────┼──────────────────────┼───────────────────────┼───────────────────────────┼──────────────────────────┼───────────────────────────────┼───────────────────┼────────────────────────┼──────────────────────────┼─────────────────────────┼──────────────────────────────────┼───────────────────────────┼────────────────────────┼─────────────────────────────┼──────────────────────┼───────────────────────────┼──────────────────┼──────────────────┼───────────────────────┼────────────────────────────┼─────────────────────┼───────────────────────┼─────────────────┼──────────────────┼───────────────────────────┼────────────────────┼────────────────────────────┼──────────────────────┼─────────────────────────┼────────────────────────┼───────────────────────────────────┼────────────────────────────┼──────────────────────┼─────────────────┼───────────────────────────────┼───────────────────────────────┼────────────────────────────────┼─────────────────────────────────┼─────────────────────────────────┼────────────────────────┼─────────────────┼──────────────────────┼────────────────────────────┼─────────────────────────┤\n│ 1915559 │ 2020-09-02 │ 0 │ 0.0 │ 5.0 │ 1.0 │ NULL │ NULL │ 1963-12-01 │ 717ddd49 │ 4.0 │ 6.0 │ 2.0 │ 4.0 │ a55475b1 │ 5.0 │ NULL │ NULL │ 0.0 │ 9490.187 │ 1366.6000 │ 0.0 │ 0.0 │ 1.0 │ 0.0 │ -4.0 │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ 
NULL │ NULL │ NULL │ NULL │ a55475b1 │ 1.0 │ 0.000 │ 2012-11-15 │ 72000.0 │ a55475b1 │ a55475b1 │ T │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ 50000.0 │ PRIVATE_SECTOR_EMPLOYEE │\n│ 1915592 │ 2020-09-02 │ 0 │ 0.0 │ 6.0 │ 1.0 │ NULL │ NULL │ 1983-01-01 │ 6b2ae0fa │ 6.0 │ 11.0 │ 2.0 │ 2.0 │ a55475b1 │ 6.0 │ NULL │ NULL │ 0.0 │ 61296.600 │ 1268.4000 │ 0.0 │ 0.0 │ 0.0 │ 0.0 │ -6.0 │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ a55475b1 │ 0.0 │ 0.000 │ 2013-09-15 │ 199600.0 │ a55475b1 │ a55475b1 │ K │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ 70000.0 │ SALARIED_GOVT │\n│ 1915605 │ 2020-09-02 │ 0 │ 0.0 │ 1.0 │ 0.0 │ NULL │ NULL │ 1977-01-01 │ a55475b1 │ 0.0 │ 1.0 │ 2.0 │ 1.0 │ a55475b1 │ 1.0 │ NULL │ NULL │ 0.0 │ 359920.470 │ 8483.2000 │ 6434.4 │ 0.0 │ 0.0 │ 0.0 │ -15.0 │ -6.0 │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ a55475b1 │ NULL │ 43596.227 │ 2014-01-15 │ 199600.0 │ a55475b1 │ a55475b1 │ T │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ 200000.0 │ SELFEMPLOYED │\n│ 1915620 │ 2020-09-02 │ 0 │ 0.0 │ 1.0 │ 0.0 │ NULL │ NULL │ 1993-04-01 │ a55475b1 │ 0.0 │ 0.0 │ 0.0 │ 1.0 │ a55475b1 │ 1.0 │ NULL │ NULL │ 0.0 │ 129430.370 │ 2368.2000 │ 0.0 │ 0.0 │ 0.0 │ 0.0 │ -24.0 │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ a55475b1 │ NULL │ 0.000 │ 2018-06-15 │ 30000.0 │ a55475b1 │ a55475b1 │ K │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ 24000.0 │ SALARIED_GOVT │\n│ 1915695 │ 2020-09-02 │ 0 │ 0.0 │ 1.0 │ 0.0 │ NULL │ NULL │ 1981-07-01 │ a55475b1 │ 0.0 │ 0.0 │ 0.0 │ 1.0 │ a55475b1 │ 1.0 │ NULL │ NULL │ 0.0 │ 15998.000 │ 6839.8003 │ 0.0 │ 
0.0 │ 0.0 │ 0.0 │ -10.0 │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ a55475b1 │ NULL │ 0.000 │ 2016-01-15 │ 40000.0 │ a55475b1 │ a55475b1 │ K │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ 30000.0 │ SALARIED_GOVT │\n└─────────┴───────────────┴────────┴─────────────┴──────────────┴─────────────┴────────────────┴─────────────┴──────────────────┴─────────────────┴───────────────────┴────────────────────┴────────────────────┴────────────────────┴────────────────┴──────────────────────┴──────────────────────┴───────────────────────┴─────────────────────────┴────────────────────────────────┴──────────────┴──────────────────────┴─────────────────────┴──────────────────────┴───────────────────────┴───────────────────────────┴──────────────────────────┴───────────────────────────────┴───────────────────┴────────────────────────┴──────────────────────────┴─────────────────────────┴──────────────────────────────────┴───────────────────────────┴────────────────────────┴─────────────────────────────┴──────────────────────┴───────────────────────────┴──────────────────┴──────────────────┴───────────────────────┴────────────────────────────┴─────────────────────┴───────────────────────┴─────────────────┴──────────────────┴───────────────────────────┴────────────────────┴────────────────────────────┴──────────────────────┴─────────────────────────┴────────────────────────┴───────────────────────────────────┴────────────────────────────┴──────────────────────┴─────────────────┴───────────────────────────────┴───────────────────────────────┴────────────────────────────────┴─────────────────────────────────┴─────────────────────────────────┴────────────────────────┴─────────────────┴──────────────────────┴────────────────────────────┴─────────────────────────┘\n\n```\n:::\n:::\n\n\nUnivariate analysis:\n\n::: {#2b1a9226 .cell execution_count=17}\n``` {.python 
.cell-code code-fold=\"true\" code-summary=\"Show code to describe the train dataset\"}\n# take the first 10 columns\ndf_train[df_train.columns[:10]].describe()\n```\n\n::: {.cell-output .cell-output-display execution_count=48}\n```{=html}\n
┏━━━━━━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓\n┃ name ┃ pos ┃ type ┃ count ┃ nulls ┃ unique ┃ mode ┃ mean ┃ std ┃ min ┃ p25 ┃ p50 ┃ p75 ┃ max ┃\n┡━━━━━━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩\n│ string │ int16 │ string │ int64 │ int64 │ int64 │ string │ float64 │ float64 │ float64 │ float64 │ float64 │ float64 │ float64 │\n├─────────────────┼───────┼─────────┼─────────┼────────┼─────────┼──────────┼──────────────┼───────────────┼─────────┼─────────────┼──────────────┼──────────────┼──────────────┤\n│ case_id │ 0 │ int64 │ 1526659 │ 0 │ 1526659 │ NULL │ 1.286077e+06 │ 718946.592285 │ 0.0 │ 766197.5000 │ 1.357358e+06 │ 1.739022e+06 │ 2.703454e+06 │\n│ target │ 2 │ int64 │ 1526659 │ 0 │ 2 │ NULL │ 3.143728e-02 │ 0.174496 │ 0.0 │ 0.0000 │ 0.000000e+00 │ 0.000000e+00 │ 1.000000e+00 │\n│ days30_165L │ 3 │ float64 │ 1526659 │ 140968 │ 22 │ NULL │ 5.177078e-01 │ 0.899238 │ 0.0 │ 0.0000 │ 0.000000e+00 │ 1.000000e+00 │ 2.200000e+01 │\n│ days360_512L │ 4 │ float64 │ 1526659 │ 140968 │ 92 │ NULL │ 4.777066e+00 │ 5.168856 │ 0.0 │ 1.0000 │ 3.000000e+00 │ 6.500000e+00 │ 1.150000e+02 │\n│ days90_310L │ 5 │ float64 │ 1526659 │ 140968 │ 37 │ NULL │ 1.211420e+00 │ 1.655931 │ 0.0 │ 0.0000 │ 1.000000e+00 │ 2.000000e+00 │ 4.100000e+01 │\n│ pmtscount_423L │ 6 │ float64 │ 1526659 │ 954021 │ 66 │ NULL │ 5.839291e+00 │ 4.148264 │ 0.0 │ 3.0000 │ 6.000000e+00 │ 7.000000e+00 │ 1.210000e+02 │\n│ pmtssum_45A │ 7 │ float64 │ 1526659 │ 954021 │ 265229 │ NULL │ 1.319994e+04 │ 18117.218312 │ 0.0 │ 3156.4001 │ 8.391900e+03 │ 1.699200e+04 │ 4.768434e+05 │\n│ education_1103M │ 9 │ string │ 1526659 │ 26183 │ 5 │ a55475b1 │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL │ NULL 
│\n└─────────────────┴───────┴─────────┴─────────┴────────┴─────────┴──────────┴──────────────┴───────────────┴─────────┴─────────────┴──────────────┴──────────────┴──────────────┘\n\n```\n:::\n:::\n\n\n## Last-mile data preprocessing\nWe will perform the following transformation before feeding the data to models:\n\n* Missing value imputation\n* Encoding categorical variables\n* Handling date variables\n* Handling outliers\n* Scaling and normalization\n\n::: {.callout-note}\nIbisML provides a set of transformations. You can find the\n[roadmap](https://github.com/ibis-project/ibis-ml/issues/32).\nThe [IbisML website](https://ibis-project.github.io/ibis-ml/) also includes tutorials and API documentation.\n:::\n\n### Impute features\nImpute all numeric columns using the median. In real-life scenarios, it's important to\nunderstand the meaning of each feature and apply the appropriate imputation method for\ndifferent features. For more imputations, please refer to this\n[documentation](https://ibis-project.github.io/ibis-ml/reference/steps-imputation.html).\n\n::: {#99303c5a .cell execution_count=18}\n``` {.python .cell-code code-fold=\"true\" code-summary=\"Show code to impute all numeric columns with median\"}\nstep_impute_median = ml.ImputeMedian(ml.numeric())\n```\n:::\n\n\n### Encode categorical features\nEncode all categorical features using one-hot-encode. 
For more encoding steps,\nplease refer to this\n[doc](https://ibis-project.github.io/ibis-ml/reference/steps-encoding.html).\n\n::: {#3e1df3b1 .cell execution_count=19}\n``` {.python .cell-code code-fold=\"true\" code-summary=\"Show code to one-hot encode selected columns\"}\nohe_step = ml.OneHotEncode(\n [\n \"maritalst_893M\",\n \"requesttype_4525192L\",\n \"max_profession_152M\",\n \"max_gender_992L\",\n \"max_empl_industry_691L\",\n \"max_housingtype_772L\",\n \"max_incometype_1044T\",\n \"max_cancelreason_3545846M\",\n \"max_rejectreason_755M\",\n \"education_1103M\",\n \"max_status_219L\",\n ]\n)\n```\n:::\n\n\n### Handle date variables\nCalculate the difference in days between each date column and the `date_decision` column:\n\n::: {#cb197a80 .cell execution_count=20}\n``` {.python .cell-code code-fold=\"true\" code-summary=\"Show code to calculate days difference between date columns and date_decision\"}\ndate_cols = [col_name for col_name in df_train.columns if col_name[-1] == \"D\"]\ndays_to_decision_expr = {\n # difference in days\n f\"{col}_date_decision_diff\": (\n _.date_decision.epoch_seconds() - getattr(_, col).epoch_seconds()\n )\n / (60 * 60 * 24)\n for col in date_cols\n}\ndays_to_decision_step = ml.Mutate(days_to_decision_expr)\n```\n:::\n\n\nExtract information from the date columns:\n\n::: {#63129062 .cell execution_count=21}\n``` {.python .cell-code code-fold=\"true\" code-summary=\"Show code to extract day and week info from date columns\"}\n# expand each date column into week and day components\nexpand_date_step = ml.ExpandDate(ml.date(), [\"week\", \"day\"])\n```\n:::\n\n\n### Handle outliers\nCap outliers using the `z-score` method:\n\n::: {#fe15771d .cell execution_count=22}\n``` {.python .cell-code code-fold=\"true\" code-summary=\"Show code to cap outliers for selected columns\"}\nstep_handle_outliers = ml.HandleUnivariateOutliers(\n [\"max_amount_1115A\", \"max_overdueamountmax_950A\"],\n method=\"z-score\",\n treatment=\"capping\",\n 
deviation_factor=3,\n)\n```\n:::\n\n\n### Construct recipe\nWe'll construct the last-mile preprocessing [recipe](https://ibis-project.github.io/ibis-ml/reference/core.html#ibis_ml.Recipe)\nby chaining all the transformation steps; the recipe will be fitted on the training dataset and later applied to the test dataset.\n\n::: {#fd8a067a .cell execution_count=23}\n``` {.python .cell-code code-fold=\"true\" code-summary=\"Show code to construct the recipe\"}\nlast_mile_preprocessing = ml.Recipe(\n expand_date_step,\n ml.Drop(ml.date()),\n # handle string columns\n ohe_step,\n ml.Drop(ml.string()),\n # handle numeric cols\n # capping outliers\n step_handle_outliers,\n step_impute_median,\n ml.ScaleMinMax(ml.numeric()),\n # fill missing value\n ml.FillNA(ml.numeric(), 0),\n ml.Cast(ml.numeric(), \"float32\"),\n)\nprint(f\"Last-mile preprocessing recipe: \\n{last_mile_preprocessing}\")\n```\n\n::: {.cell-output .cell-output-stdout}\n```\nLast-mile preprocessing recipe: \nRecipe(ExpandDate(date(), components=['week', 'day']),\n Drop(date()),\n OneHotEncode(cols(('maritalst_893M', 'requesttype_4525192L', 'max_profession_152M', 'max_gender_992L', 'max_empl_industry_691L', 'max_housingtype_772L', 'max_incometype_1044T', 'max_cancelreason_3545846M', 'max_rejectreason_755M', 'education_1103M', 'max_status_219L'))),\n Drop(string()),\n HandleUnivariateOutliers(cols(('max_amount_1115A', 'max_overdueamountmax_950A')),\n method='z-score',\n deviation_factor=3,\n treatment='capping'),\n ImputeMedian(numeric()),\n ScaleMinMax(numeric()),\n FillNA(numeric(), 0),\n Cast(numeric(), 'float32'))\n```\n:::\n:::\n\n\n## Modeling\nAfter completing data preprocessing with Ibis and IbisML, we proceed to the modeling\nphase. 
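\n\nLater in this section we'll split the data into train and test sets by hashing each row's unique key. The property that makes this reproducible is easy to see in plain Python; the sketch below uses the standard library's `hashlib` as a stand-in for the Ibis `.hash()` expression used in the actual splitting code (an illustration only, not part of the pipeline):\n\n```{.python}\nimport hashlib\n\ndef in_train_split(case_id, random_key=\"222\", buckets=4):\n    # hash the unique key plus a fixed salt; 3 of 4 buckets gives roughly a 75% training set\n    digest = hashlib.sha256((str(case_id) + random_key).encode()).hexdigest()\n    return int(digest, 16) % buckets < 3\n\n# deterministic: the same case_id always lands in the same split,\n# no matter how often it is evaluated or how the rows are ordered\nassert in_train_split(1915559) == in_train_split(1915559)\n```\n\n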
There are two ways to hand the preprocessed data to a modeling framework:\n\n* Use IbisML as an independent data preprocessing component and hand off the data to downstream modeling\nframeworks in various output formats:\n - pandas DataFrame\n - NumPy array\n - Polars DataFrame\n - Dask DataFrame\n - xgboost.DMatrix\n - PyArrow Table\n* Use IbisML recipes as components within a scikit-learn Pipeline and\ntrain models just as you would with an ordinary scikit-learn pipeline.\n\nWe will build an XGBoost model within a scikit-learn pipeline, and a neural network classifier using the\noutput transformed by IbisML recipes.\n\n### Train and test data splitting\nWe'll use hashing on the unique key to consistently split rows into different groups.\nHashing is robust to underlying changes in the data, such as adding, deleting, or\nreordering rows. This deterministic process ensures that each data point is always\nassigned to the same split, thereby enhancing reproducibility.\n\n::: {#07531183 .cell execution_count=24}\n``` {.python .cell-code code-fold=\"true\" code-summary=\"Show code to split data into train and test\"}\nimport random\n\n# this enables the analysis to be reproducible when random numbers are used\nrandom.seed(222)\nrandom_key = str(random.getrandbits(256))\n\n# put 3/4 of the data into the training set\ndf_train = df_train.mutate(\n train_flag=(df_train.case_id.cast(dt.str) + random_key).hash().abs() % 4 < 3\n)\n# split the dataset by train_flag\n# todo: use ml.train_test_split() after next release\ntrain_data = df_train[df_train.train_flag].drop(\"train_flag\")\ntest_data = df_train[~df_train.train_flag].drop(\"train_flag\")\n\nX_train = train_data.drop(\"target\")\ny_train = train_data.target.cast(dt.float32).name(\"target\")\n\nX_test = test_data.drop(\"target\")\ny_test = test_data.target.cast(dt.float32).name(\"target\")\n\ntrain_cnt = X_train.count().execute()\ntest_cnt = X_test.count().execute()\nprint(f\"train dataset size = {train_cnt} \\ntest data size = {test_cnt}\")\n```\n\n::: {.cell-output 
.cell-output-stdout}\n```\ntrain dataset size = 1144339 \ntest data size = 382320\n```\n:::\n:::\n\n\n::: {.callout-warning}\nHashing provides a consistent but pseudo-random distribution of data, which\nmay not precisely align with the specified train/test ratio. While hash codes\nensure reproducibility, they don't guarantee an exact split. Due to statistical variance,\nyou might find a slight imbalance in the distribution, resulting in marginally more or\nfewer samples in either the training or test dataset than the target percentage. This\nminor deviation from the intended ratio is a normal consequence of hash-based\npartitioning.\n:::\n\n### XGBoost\nIn this section, we integrate XGBoost into a scikit-learn pipeline to create a\nstreamlined workflow for training and evaluating our model.\n\nWe'll set up a pipeline that includes two components:\n\n* **Preprocessing**: This step applies the `last_mile_preprocessing` recipe for final data preprocessing.\n* **Modeling**: This step uses `xgb.XGBClassifier()` to train the XGBoost model.\n\n::: {#7a5ff114 .cell execution_count=25}\n``` {.python .cell-code code-fold=\"true\" code-summary=\"Show code to build and fit the pipeline\"}\nfrom sklearn.pipeline import Pipeline\nfrom sklearn.metrics import roc_auc_score\nimport xgboost as xgb\n\nmodel = xgb.XGBClassifier(\n n_estimators=100,\n max_depth=5,\n learning_rate=0.05,\n subsample=0.8,\n colsample_bytree=0.8,\n random_state=42,\n)\n# create the pipeline with the last-mile recipe and the model\npipe = Pipeline([(\"last_mile_recipes\", last_mile_preprocessing), (\"model\", model)])\n# fit the pipeline on the training data\npipe.fit(X_train, y_train)\n```\n\n::: {.cell-output .cell-output-display execution_count=56}\n```{=html}\n
Pipeline(steps=[('last_mile_recipes',\n Recipe(ExpandDate(date(), components=['week', 'day']),\n Drop(date()),\n OneHotEncode(cols(('maritalst_893M', 'requesttype_4525192L', 'max_profession_152M', 'max_gender_992L', 'max_empl_industry_691L', 'max_housingtype_772L', 'max_incometype_1044T', 'max_cancelreason_3545846M', 'max_rejectreason_755M', 'education_1103M', 'max_sta...\n feature_types=None, gamma=None, grow_policy=None,\n importance_type=None,\n interaction_constraints=None, learning_rate=0.05,\n max_bin=None, max_cat_threshold=None,\n max_cat_to_onehot=None, max_delta_step=None,\n max_depth=5, max_leaves=None,\n min_child_weight=None, missing=nan,\n monotone_constraints=None, multi_strategy=None,\n n_estimators=100, n_jobs=None,\n num_parallel_tree=None, random_state=42, ...))])In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook.
Pipeline(steps=[('last_mile_recipes',\n Recipe(ExpandDate(date(), components=['week', 'day']),\n Drop(date()),\n OneHotEncode(cols(('maritalst_893M', 'requesttype_4525192L', 'max_profession_152M', 'max_gender_992L', 'max_empl_industry_691L', 'max_housingtype_772L', 'max_incometype_1044T', 'max_cancelreason_3545846M', 'max_rejectreason_755M', 'education_1103M', 'max_sta...\n feature_types=None, gamma=None, grow_policy=None,\n importance_type=None,\n interaction_constraints=None, learning_rate=0.05,\n max_bin=None, max_cat_threshold=None,\n max_cat_to_onehot=None, max_delta_step=None,\n max_depth=5, max_leaves=None,\n min_child_weight=None, missing=nan,\n monotone_constraints=None, multi_strategy=None,\n n_estimators=100, n_jobs=None,\n num_parallel_tree=None, random_state=42, ...))])
Recipe(ExpandDate(date(), components=['week', 'day']),\n Drop(date()),\n OneHotEncode(cols(('maritalst_893M', 'requesttype_4525192L', 'max_profession_152M', 'max_gender_992L', 'max_empl_industry_691L', 'max_housingtype_772L', 'max_incometype_1044T', 'max_cancelreason_3545846M', 'max_rejectreason_755M', 'education_1103M', 'max_status_219L'))),\n Drop(string()),\n HandleUnivariateOutliers(cols(('max_amount_1115A', 'max_overdueamountmax_950A')),\n method='z-score',\n deviation_factor=3,\n treatment='capping'),\n ImputeMedian(numeric()),\n ScaleMinMax(numeric()),\n FillNA(numeric(), 0),\n Cast(numeric(), 'float32'))
ExpandDate(date(), components=['week', 'day'])
Drop(date())
OneHotEncode(cols(('maritalst_893M', 'requesttype_4525192L', 'max_profession_152M', 'max_gender_992L', 'max_empl_industry_691L', 'max_housingtype_772L', 'max_incometype_1044T', 'max_cancelreason_3545846M', 'max_rejectreason_755M', 'education_1103M', 'max_status_219L')))
Drop(string())
HandleUnivariateOutliers(cols(('max_amount_1115A', 'max_overdueamountmax_950A')),\n method='z-score',\n deviation_factor=3,\n treatment='capping')
ImputeMedian(numeric())
ScaleMinMax(numeric())
FillNA(numeric(), 0)
Cast(numeric(), 'float32')
XGBClassifier(base_score=None, booster=None, callbacks=None,\n colsample_bylevel=None, colsample_bynode=None,\n colsample_bytree=0.8, device=None, early_stopping_rounds=None,\n enable_categorical=False, eval_metric=None, feature_types=None,\n gamma=None, grow_policy=None, importance_type=None,\n interaction_constraints=None, learning_rate=0.05, max_bin=None,\n max_cat_threshold=None, max_cat_to_onehot=None,\n max_delta_step=None, max_depth=5, max_leaves=None,\n min_child_weight=None, missing=nan, monotone_constraints=None,\n multi_strategy=None, n_estimators=100, n_jobs=None,\n num_parallel_tree=None, random_state=42, ...)