diff --git a/docs/reference/cli.md b/docs/reference/cli.md index 2bf47b31..86c23dce 100644 --- a/docs/reference/cli.md +++ b/docs/reference/cli.md @@ -1,6 +1,16 @@ # NePS Command Line Interface This section provides a brief overview of the commands available in the NePS CLI. +!!! note "Support of Development and Task ID" + The NePS arguments `development_stage_id` and `task_id` are only partially + supported. To retrieve results for a specific task or development stage, you must modify the `root_directory` to + point to the corresponding folder of your NePS results. For example, if you have task_id 1 and development_stage_id 4, + update your root_directory to root_directory/task_1/development_4. This can be done either by specifying the + --root-directory option in your command or by updating the root_directory in your corresponding `run_args` yaml + file. + + +--- ## **`init` Command** Generates a default `run_args` YAML configuration file, providing a template that you can customize for your experiments. @@ -11,7 +21,7 @@ Generates a default `run_args` YAML configuration file, providing a template tha - `-h, --help` (Optional): show this help message and exit - `--config-path` (Optional): Optional custom path for generating the configuration file. Default is 'run_config.yaml'. - `--template` (Optional): Optional, options between different templates. Required configs(basic) vs all neps configs (complete) -- `--state-machine` (Optional): If set, creates a NEPS state. Requires an existing config.yaml. +- `--database` (Optional): If set, creates the NePS database. This is required if you want to sample and report configurations using only CLI commands. Requires an existing config.yaml. **Example Usage:** @@ -20,7 +30,7 @@ Generates a default `run_args` YAML configuration file, providing a template tha neps init --config-path custom/path/config.yaml --template complete ``` - +--- ## **`run` Command** Executes the optimization based on the provided configuration. This command serves as a CLI wrapper around `neps.run`, effectively mapping each CLI argument to a parameter in `neps.run`. It offers a flexible interface that allows you to override the existing settings specified in the YAML configuration file, facilitating dynamic adjustments for managing your experiments. @@ -56,7 +66,7 @@ Executes the optimization based on the provided configuration. This command serv neps run --run-args path/to/config.yaml --max-evaluations-total 50 ``` - +--- ## **`status` Command** Check the status of the NePS run. This command provides a summary of trials, including pending, evaluating, succeeded, and failed trials. You can filter the trials displayed based on their state. @@ -64,7 +74,7 @@ Check the status of the NePS run. This command provides a summary of trials, inc - `-h, --help` (Optional): show this help message and exit -- `--root-directory` (Optional): Optional: The path to your root_directory. If not provided, it will be loaded from run_config.yaml. +- `--root-directory` (Optional): The path to your root_directory. If not provided, it will be loaded from run_config.yaml. - `--pending` (Optional): Show only pending trials. - `--evaluating` (Optional): Show only evaluating trials. - `--succeeded` (Optional): Show only succeeded trials. @@ -75,7 +85,7 @@ Check the status of the NePS run. 
This command provides a summary of trials, inc neps status --root-directory path/to/directory --succeeded ``` - +--- ## **`info-config` Command** Provides detailed information about a specific configuration identified by its ID. This includes metadata, configuration values, and trial status. @@ -94,16 +104,24 @@ Provides detailed information about a specific configuration identified by its I neps info-config 42 --root-directory path/to/directory ``` +--- ## **`results` Command** Displays the results of the NePS run, listing all incumbent trials in reverse order (most recent first). Optionally, -you can plot the results to visualize the progression of incumbents over trials. +you can plot the results to visualize the progression of incumbents over trials. Additionally, you can dump all +trials or only the incumbent trials to a file in the specified +format (csv, json, parquet). **Arguments:** - `-h, --help` (Optional): show this help message and exit - `--root-directory` (Optional): Optional: The path to your root_directory. If not provided, it will be loaded from run_config.yaml. -- `--plot` (Optional): Plot the results if set. +- `--plot` (Optional): Plot the incumbents if set. +- `--dump-all-configs` (Optional): Dump all information about the trials to a file in the specified format (csv, json, + parquet). +- `--dump-incumbents` (Optional): Dump only the information about the incumbent trials to a file in the specified + format (csv, json, parquet). + **Example Usage:** @@ -112,8 +130,7 @@ you can plot the results to visualize the progression of incumbents over trials. ```bash neps results --root-directory path/to/directory --plot ``` - - +--- ## **`errors` Command** Lists all errors found in the specified NePS run. This is useful for debugging or reviewing failed trials. @@ -131,23 +148,61 @@ neps errors --root-directory path/to/directory ``` +--- ## **`sample-config` Command** +The `sample-config` command allows users to generate new configurations based on the current state of the +NePS optimizer. This is particularly useful when you need to manually intervene in the sampling process, such +as allocating different computational resources to different configurations. +!!! note "Note" + Before using the `sample-config` command, you need to initialize the database by running `neps init --database` if you haven't already executed `neps run`. Running `neps run` will also create the NePS state. **Arguments:** - - `-h, --help` (Optional): show this help message and exit -- `--root-directory` (Optional): Optional: The path to your root_directory. If not provided, it will be loaded from run_config.yaml. +- `--worker-id` (Optional): The worker ID for which the configuration is being sampled. +- `--run-args` (Optional): Path to the YAML configuration file. If not provided, it will look for run_config.yaml. +- `--number-of-configs` (Optional): Number of configurations to sample (default: 1). **Example Usage:** + ```bash -neps sample-config --help +neps sample-config --worker-id worker_1 --number-of-configs 5 ``` +--- +## **`report-config` Command** +The `report-config` command is the counterpart to `sample-config` and reports the outcome of a specific trial by updating its status and associated metrics in the NePS state. This command is crucial for manually managing the evaluation results of sampled configurations.
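+For example, a failed evaluation can be reported together with its error details. The following is only a sketch; the trial ID, worker ID, and messages are placeholders, and it uses only the flags listed under Arguments below:
+
+```bash
+neps report-config 42 failed --worker-id worker_1 \
+    --err "CUDA out of memory" \
+    --tb "Traceback (most recent call last): ..." \
+    --duration 42.5 \
+    --time-end "2024-01-01 12:00:00"
+```
+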
+ +**Arguments:** + + +- `trial_id` (Required): ID of the trial to report +- `reported_as` (Required): Outcome of the trial (`success`, `failed`, or `crashed`) + + +- `-h, --help` (Optional): show this help message and exit +- `--worker-id` (Optional): The worker ID for which the configuration is being reported. +- `--loss` (Optional): Loss value of the trial +- `--run-args` (Optional): Path to the YAML file containing run configurations +- `--cost` (Optional): Cost value of the trial +- `--learning-curve` (Optional): Learning curve as a list of floats, provided like this --learning-curve 0.9 0.3 0.1 +- `--duration` (Optional): Duration of the evaluation in seconds +- `--err` (Optional): Error message if any +- `--tb` (Optional): Traceback information if any +- `--time-end` (Optional): The time the trial ended as either a UNIX timestamp (float) or in 'YYYY-MM-DD HH:MM:SS' format + + +**Example Usage:** + + +```bash +neps report-config 42 success --worker-id worker_1 --loss 0.95 --duration 120 +``` +--- ## **`help` Command** Displays help information for the NePS CLI, including a list of available commands and their descriptions. @@ -163,3 +218,53 @@ Displays help information for the NePS CLI, including a list of available comman neps help --help ``` +--- +## **Using NePS as a State Machine** + +NePS can function as a state machine, allowing you to manually sample and report configurations using CLI commands. This is particularly useful in scenarios like architecture search, where different configurations may require varying computational resources. To utilize NePS in this manner, follow these steps: + +### **Step 1: Initialize and Configure `run_config.yaml`** + +Begin by generating the `run_args` YAML configuration file. This file serves as the blueprint for your optimization experiments. + + +```bash +neps init +``` +The `neps init` command creates `run_config.yaml`, which serves as the default configuration resource for all NePS commands. +### **Step 2: Initialize the NePS Database** + +Set up the NePS database to enable the sampling and reporting of configurations via CLI commands. + +```bash +neps init --database +``` +This command initializes the NePS database, preparing the necessary folders and files required for managing your NePS run. + + +### **Step 3: Sample Configurations** + +Generate new configurations based on the existing NePS state. This step allows you to create configurations that you can manually evaluate. + +```bash +neps sample-config --worker-id worker_1 --number-of-configs 5 +``` + +- **`--worker-id worker_1`**: Identifies the worker responsible for sampling configurations. +- **`--number-of-configs 5`**: Specifies the number of configurations to sample. + +### **Step 4: Evaluate and Report Configurations** + +After evaluating each sampled configuration, report its outcome to update the NePS state. + +```bash +neps report-config 42 success --worker-id worker_1 --loss 0.95 --duration 120 +``` + +- **`42`**: The ID of the trial being reported. +- **`success`**: The outcome of the trial (`success`, `failed`, `crashed`). +- **`--worker-id worker_1`**: Identifies the worker reporting the configuration. +- **`--loss 0.95`**: The loss value obtained from the trial. +- **`--duration 120`**: The duration of the evaluation in seconds.
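+
+Once trials have been reported, the usual inspection commands operate on the same NePS state. As a rough sketch (assuming the root directory can be read from `run_config.yaml`, so `--root-directory` is omitted):
+
+```bash
+neps status
+neps results --dump-incumbents csv
+```
+
+The first command summarizes pending, evaluating, succeeded, and failed trials; the second writes the incumbent trials to a CSV file in the run's `summary_csv` folder.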
+ + diff --git a/neps/api.py b/neps/api.py index d4878c88..a140a7f6 100644 --- a/neps/api.py +++ b/neps/api.py @@ -278,7 +278,7 @@ def _run_args( "mobster", "asha", ] - | BaseOptimizer + | BaseOptimizer | dict ) = "default", **searcher_kwargs, ) -> tuple[BaseOptimizer, dict]: diff --git a/neps/utils/cli.py b/neps/utils/cli.py index b504bf57..1ad92d4a 100644 --- a/neps/utils/cli.py +++ b/neps/utils/cli.py @@ -3,15 +3,12 @@ from __future__ import annotations import warnings +from typing import Tuple from datetime import timedelta, datetime import seaborn as sns import matplotlib.pyplot as plt import os import numpy as np - -# Suppress specific warnings -# warnings.filterwarnings("ignore", category=UserWarning, module="torch._utils") -from neps.state.trial import Trial import argparse import logging import yaml @@ -19,44 +16,156 @@ from typing import Optional, List import neps from neps.api import Default +from neps.status.status import post_run_csv +import pandas as pd +from neps.utils.run_args import ( + RUN_ARGS, + RUN_PIPELINE, + ROOT_DIRECTORY, + POST_RUN_SUMMARY, + MAX_EVALUATIONS_PER_RUN, + MAX_EVALUATIONS_TOTAL, + MAX_COST_TOTAL, + PIPELINE_SPACE, + DEVELOPMENT_STAGE_ID, + TASK_ID, + SEARCHER, + SEARCHER_KWARGS, + IGNORE_ERROR, + LOSS_VALUE_ON_ERROR, + COST_VALUE_ON_ERROR, + CONTINUE_UNTIL_MAX_EVALUATION_COMPLETED, + OVERWRITE_WORKING_DIRECTORY, + get_run_args_from_yaml, +) +from neps.optimizers.base_optimizer import BaseOptimizer from neps.utils.run_args import load_and_return_object from neps.state.filebased import ( create_or_load_filebased_neps_state, load_filebased_neps_state, ) +from neps.state.neps_state import NePSState +from neps.state.trial import Trial from neps.exceptions import VersionedResourceDoesNotExistsError, TrialNotFoundError from neps.status.status import get_summary_dict +from neps.api import _run_args +from neps.state.optimizer import BudgetInfo, OptimizationState, OptimizerInfo + +# Suppress specific warnings +warnings.filterwarnings("ignore", category=UserWarning, module="torch._utils") + + +def validate_directory(path: Path) -> bool: + """ + Validates whether the given path exists and is a directory. + + Args: + path (Path): The path to validate. + + Returns: + bool: True if valid, False otherwise. 
+ """ + if not path.exists(): + print(f"Error: The directory '{path}' does not exist.") + return False + if not path.is_dir(): + print(f"Error: The path '{path}' exists but is not a directory.") + return False + return True -def get_root_directory(args: argparse.Namespace) -> Path: - """Load the root directory from the provided argument or from the config.yaml file.""" +def get_root_directory(args: argparse.Namespace) -> Optional[Path]: + # Command-line argument handling if args.root_directory: - return Path(args.root_directory) + root_dir = Path(args.root_directory) + if validate_directory(root_dir): + return root_dir + else: + return None - config_path = Path("run_config.yaml") + # Configuration file handling + config_path = Path("run_config.yaml").resolve() if config_path.exists(): - with config_path.open("r") as file: - config = yaml.safe_load(file) - root_directory = config.get("root_directory") + try: + with config_path.open("r") as file: + config = yaml.safe_load(file) + except yaml.YAMLError as e: + print(f"Error parsing '{config_path}': {e}") + return None + + root_directory = config.get(ROOT_DIRECTORY) if root_directory: - return Path(root_directory) + root_directory_path = Path(root_directory) + if validate_directory(root_directory_path): + return root_directory_path + else: + return None else: - raise ValueError( - "The config.yaml file exists but does not contain 'root_directory'." + print( + "Error: The 'run_config.yaml' file exists but does not contain the " + "'root_directory' key." ) + return None else: - raise ValueError( - "Either the root_directory must be provided as an argument or config.yaml " - "must exist with a 'root_directory' key." + print( + "Error: 'root_directory' must be provided as a command-line argument " + "or defined in 'run_config.yaml'." ) + return None def init_config(args: argparse.Namespace) -> None: """Creates a 'run_args' configuration YAML file template if it does not already exist. """ - config_path = Path(args.config_path) if args.config_path else Path("run_config.yaml") - if not config_path.exists(): + config_path = ( + Path(args.config_path).resolve() + if args.config_path + else Path("run_config.yaml").resolve() + ) + + if args.database: + if config_path.exists(): + run_args = get_run_args_from_yaml(config_path) + max_cost_total = run_args.get(MAX_COST_TOTAL) + # Create the optimizer + _, optimizer_info = load_optimizer(run_args) + if optimizer_info is None: + return + + try: + directory = run_args.get(ROOT_DIRECTORY) + if directory is None: + return + else: + directory = Path(directory) + is_new = not directory.exists() + _ = create_or_load_filebased_neps_state( + directory=directory, + optimizer_info=OptimizerInfo(optimizer_info), + optimizer_state=OptimizationState( + budget=( + BudgetInfo(max_cost_budget=max_cost_total, used_cost_budget=0) + if max_cost_total is not None + else None + ), + shared_state={}, # TODO: Unused for the time being... + ), + ) + if is_new: + print("NePS state was successfully created.") + else: + print("NePS state was already created.") + except Exception as e: + print(f"Error creating neps state: {e}") + else: + print( + f"{config_path} does not exist. Make sure that your configuration " + f"file already exists if you don't have specified your own path. 
" + f"Run 'neps init' to create run_config.yaml" + ) + + elif not config_path.exists(): with config_path.open("w") as file: template = args.template if args.template else "basic" if template == "basic": @@ -133,9 +242,6 @@ def init_config(args: argparse.Namespace) -> None: pre_load_hooks: """ ) - elif args.state_machine: - pass - # create_or_load_filebased_neps_state() else: print(f"Path {config_path} does already exist.") @@ -176,10 +282,13 @@ def run_optimization(args: argparse.Namespace) -> None: """Collects arguments from the parser and runs the NePS optimization. Args: args (argparse.Namespace): Parsed command-line arguments. """ + if isinstance(args.run_args, Default): + run_args = Path("run_config.yaml") + else: + run_args = args.run_args if not isinstance(args.run_pipeline, Default): - print("fehler") module_path, function_name = args.run_pipeline.split(":") - run_pipeline = load_and_return_object(module_path, function_name, "run_pipeline") + run_pipeline = load_and_return_object(module_path, function_name, RUN_PIPELINE) else: run_pipeline = args.run_pipeline @@ -190,24 +299,24 @@ def run_optimization(args: argparse.Namespace) -> None: # Collect arguments from args and prepare them for neps.run options = { - "run_args": args.run_args, - "run_pipeline": run_pipeline, - "pipeline_space": args.pipeline_space, - "root_directory": args.root_directory, - "overwrite_working_directory": args.overwrite_working_directory, - "post_run_summary": args.post_run_summary, - "development_stage_id": args.development_stage_id, - "task_id": args.task_id, - "max_evaluations_total": args.max_evaluations_total, - "max_evaluations_per_run": args.max_evaluations_per_run, - "continue_until_max_evaluation_completed": ( + RUN_ARGS: run_args, + RUN_PIPELINE: run_pipeline, + PIPELINE_SPACE: args.pipeline_space, + ROOT_DIRECTORY: args.root_directory, + OVERWRITE_WORKING_DIRECTORY: args.overwrite_working_directory, + POST_RUN_SUMMARY: args.post_run_summary, + DEVELOPMENT_STAGE_ID: args.development_stage_id, + TASK_ID: args.task_id, + MAX_EVALUATIONS_TOTAL: args.max_evaluations_total, + MAX_EVALUATIONS_PER_RUN: args.max_evaluations_per_run, + CONTINUE_UNTIL_MAX_EVALUATION_COMPLETED: ( args.continue_until_max_evaluation_completed ), - "max_cost_total": args.max_cost_total, - "ignore_errors": args.ignore_errors, - "loss_value_on_error": args.loss_value_on_error, - "cost_value_on_error": args.cost_value_on_error, - "searcher": args.searcher, + MAX_COST_TOTAL: args.max_cost_total, + IGNORE_ERROR: args.ignore_errors, + LOSS_VALUE_ON_ERROR: args.loss_value_on_error, + COST_VALUE_ON_ERROR: args.cost_value_on_error, + SEARCHER: args.searcher, **kwargs, } logging.basicConfig(level=logging.INFO) @@ -218,18 +327,12 @@ def info_config(args: argparse.Namespace) -> None: """Handles the info-config command by providing information based on directory and id.""" directory_path = get_root_directory(args) + if directory_path is None: + return config_id = args.id - if not directory_path.exists() or not directory_path.is_dir(): - print( - f"Error: The directory {directory_path} does not exist or is not a " - f"directory." 
- ) - return - try: - neps_state = load_filebased_neps_state(directory_path) - except VersionedResourceDoesNotExistsError: - print(f"No NePS state found in the directory {directory_path}.") + neps_state = load_neps_state(directory_path) + if neps_state is None: return try: trial = neps_state.get_trial_by_id(config_id) @@ -259,6 +362,12 @@ def info_config(args: argparse.Namespace) -> None: print(f" Loss: {trial.report.loss}") print(f" Cost: {trial.report.cost}") print(f" Reported As: {trial.report.reported_as}") + error = trial.report.err + if error is not None: + print(f" Error Type: {type(error).__name__}") + print(f" Error Message: {str(error)}") + print(f" Traceback:") + print(f" {trial.report.tb}") else: print("No report available.") @@ -266,18 +375,11 @@ def info_config(args: argparse.Namespace) -> None: def load_neps_errors(args: argparse.Namespace) -> None: """Handles the 'errors' command by loading errors from the neps_state.""" directory_path = get_root_directory(args) - - if not directory_path.exists() or not directory_path.is_dir(): - print( - f"Error: The directory {directory_path} does not exist or is not a " - f"directory." - ) + if directory_path is None: return - try: - neps_state = load_filebased_neps_state(directory_path) - except VersionedResourceDoesNotExistsError: - print(f"No NePS state found in the directory {directory_path}.") + neps_state = load_neps_state(directory_path) + if neps_state is None: return errors = neps_state.get_errors() @@ -299,22 +401,70 @@ def load_neps_errors(args: argparse.Namespace) -> None: def sample_config(args: argparse.Namespace) -> None: - """Handles the sample-config command""" - # Get the root_directory from args or load it from run_config.yaml - directory_path = get_root_directory(args) - neps_state = load_filebased_neps_state(directory_path) + """Handles the sample-config command which samples configurations from the NePS + state.""" + # Load run_args from the provided path or default to run_config.yaml + if args.run_args: + run_args_path = Path(args.run_args) + else: + run_args_path = Path("run_config.yaml") + + if not run_args_path.exists(): + print(f"Error: run_args file {run_args_path} does not exist.") + return + + run_args = get_run_args_from_yaml(run_args_path) + + # Get root_directory from the run_args + root_directory = run_args.get(ROOT_DIRECTORY) + if not root_directory: + print("Error: 'root_directory' is not specified in the run_args file.") + return + + root_directory = Path(root_directory) + if not root_directory.exists(): + print(f"Error: The directory {root_directory} does not exist.") + return + + neps_state = load_neps_state(root_directory) + if neps_state is None: + return + + # Get the worker_id and number_of_configs from arguments + worker_id = args.worker_id + num_configs = args.number_of_configs if args.number_of_configs else 1 - # Placeholder for the logic that will be implemented - pass + optimizer, _ = load_optimizer(run_args) + if optimizer is None: + return + + # Sample trials + for _ in range(num_configs): + try: + trial = neps_state.sample_trial(optimizer, worker_id=worker_id) + except Exception as e: + print(f"Error during configuration sampling: {e}") + continue # Skip to the next iteration + print(f"Sampled configuration with Trial ID: {trial.id}") + print(f"Location: {trial.metadata.location}") + print("Configuration:") + for key, value in trial.config.items(): + print(f" {key}: {value}") + print("\n") -def convert_timestamp(timestamp: float) -> str: + +def convert_timestamp(timestamp: float | None) 
-> str: """Convert a UNIX timestamp to a human-readable datetime string.""" + if timestamp is None: + return "None" return datetime.fromtimestamp(timestamp).strftime("%Y-%m-%d %H:%M:%S") -def format_duration(seconds: float) -> str: +def format_duration(seconds: float | None) -> str: """Convert duration in seconds to a h:min:sec format.""" + if seconds is None: + return "None" duration = str(timedelta(seconds=seconds)) # Remove milliseconds for alignment if "." in duration: @@ -331,18 +481,11 @@ def status(args: argparse.Namespace) -> None: """Handles the status command, providing a summary of the NEPS run.""" # Get the root_directory from args or load it from run_config.yaml directory_path = get_root_directory(args) - - if not directory_path.exists() or not directory_path.is_dir(): - print( - f"Error: The directory {directory_path} does not exist or is not a " - f"directory." - ) + if directory_path is None: return - try: - neps_state = load_filebased_neps_state(directory_path) - except VersionedResourceDoesNotExistsError: - print(f"No NePS state found in the directory {directory_path}.") + neps_state = load_neps_state(directory_path) + if neps_state is None: return summary = get_summary_dict(directory_path, add_details=True) @@ -356,7 +499,6 @@ def status(args: argparse.Namespace) -> None: pending_trials_count = summary["num_pending_configs"] succeeded_trials_count = summary["num_evaluated_configs"] - summary["num_error"] failed_trials_count = summary["num_error"] - pending_with_worker_count = summary["num_pending_configs_with_worker"] # Print summary print("NePS Status:") @@ -366,7 +508,6 @@ def status(args: argparse.Namespace) -> None: print(f"Failed Trials (Errors): {failed_trials_count}") print(f"Active Trials: {evaluating_trials_count}") print(f"Pending Trials: {pending_trials_count}") - print(f"Pending Trials with Worker: {pending_with_worker_count}") print(f"Best Loss Achieved: {summary['best_loss']}") print("\nLatest Trials:") @@ -464,6 +605,194 @@ def status(args: argparse.Namespace) -> None: print("-----------------------------") +def results(args: argparse.Namespace) -> None: + """Handles the 'results' command by displaying incumbents, optionally plotting, + and dumping results to files based on the specified options.""" + directory_path = get_root_directory(args) + if directory_path is None: + return + + # Attempt to generate the summary CSV + try: + csv_config_data_path, _ = post_run_csv(directory_path) + except Exception as e: + print(f"Error generating summary CSV: {e}") + return + + summary_csv_dir = csv_config_data_path.parent # 'summary_csv' directory + + # Load NePS state + neps_state = load_neps_state(directory_path) + if neps_state is None: + return + + def sort_trial_id(trial_id: str) -> List[int]: + parts = trial_id.split("_") # Split the ID by '_' + # Convert each part to an integer for proper numeric sorting + return [int(part) for part in parts] + + trials = neps_state.get_all_trials() + sorted_trials = sorted(trials.values(), key=lambda x: sort_trial_id(x.id)) + + # Compute incumbents + incumbents = compute_incumbents(sorted_trials) + incumbents_ids = [trial.id for trial in incumbents] + + # Handle Dump Options + if args.dump_all_configs or args.dump_incumbents: + if args.dump_all_configs: + dump_all_configs(csv_config_data_path, summary_csv_dir, args.dump_all_configs) + return + + if args.dump_incumbents: + dump_incumbents( + csv_config_data_path, + summary_csv_dir, + args.dump_incumbents, + incumbents_ids, + ) + return + + # Display Results + 
display_results(directory_path, incumbents) + + # Handle Plotting + if args.plot: + plot_path = plot_incumbents(sorted_trials, incumbents, summary_csv_dir) + print(f"Plot saved to '{plot_path}'.") + + +def load_neps_state(directory_path: Path) -> Optional[NePSState[Path]]: + """Load the NePS state with error handling.""" + try: + return load_filebased_neps_state(directory_path) + except VersionedResourceDoesNotExistsError: + print(f"Error: No NePS state found in the directory '{directory_path}'.") + print("Ensure that the NePS run has been initialized correctly.") + except Exception as e: + print(f"Unexpected error loading NePS state: {e}") + return None + + +def compute_incumbents(sorted_trials: List[Trial]) -> List[Trial]: + """Compute the list of incumbent trials based on the best loss.""" + best_loss = float("inf") + incumbents = [] + for trial in sorted_trials: + if trial.report and trial.report.loss < best_loss: + best_loss = trial.report.loss + incumbents.append(trial) + return incumbents[::-1] # Reverse for most recent first + + +def dump_all_configs( + csv_config_data_path: Path, summary_csv_dir: Path, dump_format: str +) -> None: + """Dump all configurations to the specified format.""" + dump_format = dump_format.lower() + supported_formats = ["csv", "json", "parquet"] + if dump_format not in supported_formats: + print( + f"Unsupported dump format: '{dump_format}'. " + f"Supported formats are: {supported_formats}." + ) + return + + base_name = csv_config_data_path.stem # 'config_data' + + if dump_format == "csv": + # CSV is already available + print( + f"All trials successfully dumped to '{summary_csv_dir}/{base_name}.{dump_format}'." + ) + else: + # Define output file path with desired extension + output_file_name = f"{base_name}.{dump_format}" + output_file_path = summary_csv_dir / output_file_name + + try: + # Read the existing CSV into DataFrame + df = pd.read_csv(csv_config_data_path) + + # Save to the desired format + if dump_format == "json": + df.to_json(output_file_path, orient="records", indent=4) + elif dump_format == "parquet": + df.to_parquet(output_file_path, index=False) + + print(f"All trials successfully dumped to '{output_file_path}'.") + except Exception as e: + print(f"Error dumping all trials to '{dump_format}': {e}") + + +def dump_incumbents( + csv_config_data_path: Path, + summary_csv_dir: Path, + dump_format: str, + incumbents_ids: List[str], +) -> None: + """Dump incumbent trials to the specified format.""" + dump_format = dump_format.lower() + supported_formats = ["csv", "json", "parquet"] + if dump_format not in supported_formats: + print( + f"Unsupported dump format: '{dump_format}'. Supported formats are: {supported_formats}." 
+ ) + return + + base_name = "incumbents" # Name for incumbents file + + if not incumbents_ids: + print("No incumbent trials found to dump.") + return + + try: + # Read the existing CSV into DataFrame + df = pd.read_csv(csv_config_data_path) + + # Filter DataFrame for incumbent IDs + df_incumbents = df[df["config_id"].isin(incumbents_ids)] + + if df_incumbents.empty: + print("No incumbent trials found in the summary CSV.") + return + + # Define output file path with desired extension + output_file_name = f"{base_name}.{dump_format}" + output_file_path = summary_csv_dir / output_file_name + + # Save to the desired format + if dump_format == "csv": + df_incumbents.to_csv(output_file_path, index=False) + elif dump_format == "json": + df_incumbents.to_json(output_file_path, orient="records", indent=4) + elif dump_format == "parquet": + df_incumbents.to_parquet(output_file_path, index=False) + + print(f"Incumbent trials successfully dumped to '{output_file_path}'.") + except Exception as e: + print(f"Error dumping incumbents to '{dump_format}': {e}") + + +def display_results(directory_path: Path, incumbents: List[Trial]) -> None: + """Display the results of the NePS run.""" + print(f"Results for NePS run: {directory_path}") + print("--------------------") + print("All Incumbent Trials:") + header = f"{'ID':<6} {'Loss':<12} {'Config':<60}" + print(header) + print("-" * len(header)) + if incumbents: + for trial in incumbents: + if trial.report is not None and trial.report.loss is not None: + config = ", ".join(f"{k}: {v}" for k, v in trial.config.items()) + print(f"{trial.id:<6} {trial.report.loss:<12.6f} {config:<60}") + else: + print(f"Trial {trial.id} has no valid loss.") + else: + print("No Incumbent Trials found.") + + def plot_incumbents( all_trials: List[Trial], incumbents: List[Trial], directory_path: Path ) -> str: @@ -517,61 +846,6 @@ def plot_incumbents( return plot_path -def results(args: argparse.Namespace) -> None: - """Handles the 'results' command by displaying incumbents in - reverse order and - optionally plotting and saving the results.""" - directory_path = get_root_directory(args) - - if not directory_path.exists() or not directory_path.is_dir(): - print( - f"Error: The directory {directory_path} does not exist or is not a " - f"directory." 
- ) - return - - try: - neps_state = load_filebased_neps_state(directory_path) - except VersionedResourceDoesNotExistsError: - print(f"No NePS state found in the directory {directory_path}.") - return - - trials = neps_state.get_all_trials() - # Sort trials by trial ID - sorted_trials = sorted(trials.values(), key=lambda x: int(x.id)) - - # Compute incumbents - best_loss = float("inf") - incumbents = [] - for trial in sorted_trials: - if trial.report and trial.report.loss < best_loss: - best_loss = trial.report.loss - incumbents.append(trial) - - # Reverse the list for displaying, so the most recent incumbent is shown first - incumbents_display = incumbents[::-1] - - if not args.plot: - print(f"Results for NePS run: {directory_path}") - print("--------------------") - print("All Incumbent Trials:") - header = f"{'ID':<6} {'Loss':<12} {'Config':<60}" - print(header) - print("-" * len(header)) - if len(incumbents_display) > 0: - for trial in incumbents_display: - if trial.report is not None and trial.report.loss is not None: - config = ", ".join(f"{k}: {v}" for k, v in trial.config.items()) - print(f"{trial.id:<6} {trial.report.loss:<12.6f} {config:<60}") - else: - print(f"Trial {trial.id} has no valid loss.") - else: - print("No Incumbent Trials found.") - else: - plot_path = plot_incumbents(sorted_trials, incumbents, directory_path) - print(f"Plot saved to {plot_path}") - - def print_help(args: Optional[argparse.Namespace] = None) -> None: """Prints help information for the NEPS CLI.""" help_text = """ @@ -586,7 +860,7 @@ def print_help(args: Optional[argparse.Namespace] = None) -> None: --config-path (Optional: Specify the path for the config file. Default is run_config.yaml) --template [basic|complete] (Optional: Choose between a basic or complete template.) - --state-machine (Optional: Creates a NEPS state. Requires an existing config.yaml.) + --database (Optional: Creates a NEPS state. Requires an existing config.yaml.) neps run [OPTIONS] Runs a neural pipeline search. 
@@ -726,15 +1000,147 @@ def generate_markdown_from_parser(parser: argparse.ArgumentParser, filename: str f.write("\n".join(lines)) +def handle_report_config(args: argparse.Namespace) -> None: + """Handles the report-config command which updates reports for + trials in the NePS state.""" + # Load run_args from the provided path or default to run_config.yaml + if args.run_args: + run_args_path = Path(args.run_args) + else: + run_args_path = Path("run_config.yaml") + if not run_args_path.exists(): + print(f"Error: run_args file {run_args_path} does not exist.") + return + + run_args = get_run_args_from_yaml(run_args_path) + + # Get root_directory from run_args + root_directory = run_args.get("root_directory") + if not root_directory: + print("Error: 'root_directory' is not specified in the run_args file.") + return + + root_directory = Path(root_directory) + if not root_directory.exists(): + print(f"Error: The directory {root_directory} does not exist.") + return + + neps_state = load_neps_state(root_directory) + if neps_state is None: + return + + # Load the existing trial by ID + try: + trial = neps_state.get_trial_by_id(args.trial_id) + if not trial: + print(f"No trial found with ID {args.trial_id}") + return + except Exception as e: + print(f"Error fetching trial with ID {args.trial_id}: {e}") + return None + + # Update state of the trial and create report + report = trial.set_complete( + report_as=args.reported_as, + time_end=args.time_end, + loss=args.loss, + cost=args.cost, + learning_curve=args.learning_curve, + err=Exception(args.err) if args.err else None, + tb=args.tb, + evaluation_duration=args.duration, + extra={}, + ) + + # Update NePS state + try: + neps_state.report_trial_evaluation( + trial=trial, report=report, worker_id=args.worker_id + ) + except Exception as e: + print(f"Error updating the report for trial {args.trial_id}: {e}") + return None + + print(f"Report for trial ID {trial.metadata.id} has been successfully updated.") + + print("\n--- Report Summary ---") + print(f"Trial ID: {trial.metadata.id}") + print(f"Reported As: {report.reported_as}") + print(f"Time Ended: {convert_timestamp(trial.metadata.time_end)}") + print(f"Loss: {report.loss if report.loss is not None else 'N/A'}") + print(f"Cost: {report.cost if report.cost is not None else 'N/A'}") + print(f"Evaluation Duration: {format_duration(report.evaluation_duration)}") + + if report.learning_curve: + print(f"Learning Curve: {' '.join(map(str, report.learning_curve))}") + else: + print("Learning Curve: N/A") + + if report.err: + print(f"Error Type: {type(report.err).__name__}") + print(f"Error Message: {str(report.err)}") + print("Traceback:") + print(report.tb if report.tb else "N/A") + else: + print("Error: None") + + print("----------------------\n") + + +def load_optimizer(run_args: dict) -> Tuple[Optional[BaseOptimizer], Optional[dict]]: + """Create an optimizer""" + try: + searcher_info = { + "searcher_name": "", + "searcher_alg": "", + "searcher_selection": "", + "neps_decision_tree": True, + "searcher_args": {}, + } + + # Call _run_args() to create the optimizer + optimizer, searcher_info = _run_args( + searcher_info=searcher_info, + pipeline_space=run_args.get(PIPELINE_SPACE), + max_cost_total=run_args.get(MAX_COST_TOTAL, None), + ignore_errors=run_args.get(IGNORE_ERROR, False), + loss_value_on_error=run_args.get(LOSS_VALUE_ON_ERROR, None), + cost_value_on_error=run_args.get(COST_VALUE_ON_ERROR, None), + searcher=run_args.get(SEARCHER, "default"), + **run_args.get(SEARCHER_KWARGS, {}), + ) + return 
optimizer, searcher_info + except Exception as e: + print(f"Error creating optimizer: {e}") + return None, None + + +def parse_time_end(time_str: str) -> float: + """Parses a UNIX timestamp or a human-readable time string + and returns a UNIX timestamp.""" + try: + # First, try to interpret the input as a UNIX timestamp + return float(time_str) + except ValueError: + pass + + try: + # If that fails, try to interpret it as a human-readable datetime + # string (YYYY-MM-DD HH:MM:SS) + dt = datetime.strptime(time_str, "%Y-%m-%d %H:%M:%S") + return dt.timestamp() # Convert to UNIX timestamp (float) + except ValueError: + raise argparse.ArgumentTypeError( + f"Invalid time format: '{time_str}'. " + f"Use UNIX timestamp or 'YYYY-MM-DD HH:MM:SS'." + ) + + def main() -> None: """CLI entry point. This function sets up the command-line interface (CLI) for NePS using argparse. It defines the available subcommands and their respective arguments. - - Available commands: - - init: Generates a 'run_args' YAML template file. - - run: Runs the optimization with specified configuration. """ parser = argparse.ArgumentParser(description="NePS Command Line Interface") subparsers = parser.add_subparsers( @@ -759,7 +1165,7 @@ def main() -> None: "all neps configs (complete)", ) parser_init.add_argument( - "--state-machine", + "--database", action="store_true", help="If set, creates a NEPS state. Requires an existing config.yaml.", ) @@ -928,16 +1334,72 @@ def main() -> None: # Subparser for "sample-config" command parser_sample_config = subparsers.add_parser( - "sample-config", help="Sample a configuration from existing neps state." + "sample-config", help="Sample configurations from the existing NePS state." ) parser_sample_config.add_argument( - "--root-directory", + "--worker-id", type=str, - help="Optional: The path to your root_directory. 
If not provided, " - "it will be loaded from run_config.yaml.", + default="cli", + help="The worker ID for which the configuration is being sampled.", + ) + parser_sample_config.add_argument( + "--run-args", + type=str, + help="Optional: Path to the YAML configuration file.", + ) + parser_sample_config.add_argument( + "--number-of-configs", + type=int, + default=1, + help="Optional: Number of configurations to sample (default: 1).", ) parser_sample_config.set_defaults(func=sample_config) + report_parser = subparsers.add_parser( + "report-config", help="Report of a specific trial" + ) + report_parser.add_argument("trial_id", type=str, help="ID of the trial to report") + report_parser.add_argument( + "reported_as", + type=str, + choices=["success", "failed", "crashed"], + help="Outcome of the trial", + ) + report_parser.add_argument( + "--worker-id", + type=str, + default="cli", + help="The worker ID for which the configuration is being sampled.", + ) + report_parser.add_argument("--loss", type=float, help="Loss value of the trial") + report_parser.add_argument( + "--run-args", type=str, help="Path to the YAML file containing run configurations" + ) + report_parser.add_argument( + "--cost", type=float, help="Cost value of the trial (optional)" + ) + report_parser.add_argument( + "--learning-curve", + type=float, + nargs="+", + help="Learning curve as a list of floats (optional), provided like this " + "--learning-curve 0.9 0.3 0.1", + ) + report_parser.add_argument( + "--duration", type=float, help="Duration of the evaluation in sec (optional)" + ) + report_parser.add_argument("--err", type=str, help="Error message if any (optional)") + report_parser.add_argument( + "--tb", type=str, help="Traceback information if any (optional)" + ) + report_parser.add_argument( + "--time-end", + type=parse_time_end, # Using the custom parser function + help="The time the trial ended as either a " + "UNIX timestamp (float) or in 'YYYY-MM-DD HH:MM:SS' format", + ) + report_parser.set_defaults(func=handle_report_config) + # Subparser for "status" command parser_status = subparsers.add_parser( "status", help="Check the status of the NePS run." @@ -972,6 +1434,23 @@ def main() -> None: parser_results.add_argument( "--plot", action="store_true", help="Plot the results if set." ) + + # Create a mutually exclusive group for dump options + dump_group = parser_results.add_mutually_exclusive_group() + dump_group.add_argument( + "--dump-all-configs", + type=str, + choices=["csv", "json", "parquet"], + help="Dump all trials to a file in the specified format (csv, json, parquet).", + ) + dump_group.add_argument( + "--dump-incumbents", + type=str, + choices=["csv", "json", "parquet"], + help="Dump incumbent trials to a file in the specified format " + "(csv, json, parquet).", + ) + parser_results.set_defaults(func=results) # Subparser for "help" command