The `mle-hyperopt` Package

Hyperparameter optimization made easy 🚀

The mle-hyperopt package provides a simple and intuitive API for hyperparameter optimization of your Machine Learning Experiment (MLE) pipeline. It supports real, integer & categorical search variables and single- or multi-objective optimization.

Core features include the following:

API Simplicity: strategy.ask(), strategy.tell() interface & space definition.
Strategy Diversity: Grid, random, coordinate search, SMBO & wrapping FAIR's nevergrad, Successive Halving, Hyperband, Population-Based Training.
Search Space Refinement based on the top performing configs via strategy.refine(top_k=10).
Export of configurations to execute via e.g. python train.py --config_fname config.yaml.
Storage & reload search logs via strategy.save(<log_fname>), strategy.load(<log_fname>).

For a quickstart check out the notebook blog 📖.

The API 🎮

from mle_hyperopt import RandomSearch

# Instantiate random search class
strategy = RandomSearch(real={"lrate": {"begin": 0.1,
                                        "end": 0.5,
                                        "prior": "log-uniform"}},
                        integer={"batch_size": {"begin": 32,
                                                "end": 128,
                                                "prior": "uniform"}},
                        categorical={"arch": ["mlp", "cnn"]})

# Simple ask - eval - tell API
configs = strategy.ask(5)
values = [train_network(**c) for c in configs]
strategy.tell(configs, values)

Implemented Search Types 🔭

Search Type	Description	`search_config`
`GridSearch`	Search over list of discrete values	-
`RandomSearch`	Random search over variable ranges	`refine_after`, `refine_top_k`
`CoordinateSearch`	Coordinate-wise optimization with fixed defaults	`order`, `defaults`
`SMBOSearch`	Sequential model-based optimization (Hutter et al., 2011)	`base_estimator`, `acq_function`, `n_initial_points`
`NevergradSearch`	Multi-objective nevergrad wrapper	`optimizer`, `budget_size`, `num_workers`
`HalvingSearch`	Successive Halving (Karmin et al., 2013)	`min_budget`, `num_arms`, `halving_coeff`
`HyperbandSearch`	Hyperband (Li et al., 2018)	`max_resource`, `eta`
`PBTSearch`	Population-Based Training (Jaderberg et al., 2017)	`explore`, `exploit`

Variable Types & Hyperparameter Spaces 🌍

Variable	Type	Space Specification
`real`	Real-valued	`Dict`: `begin`, `end`, `prior`/`bins` (grid)
`integer`	Integer-valued	`Dict`: `begin`, `end`, `prior`/`bins` (grid)
`categorical`	Categorical	`List`: Values to search over

Installation ⏳

A PyPI installation is available via:

pip install mle-hyperopt

Alternatively, you can clone this repository and afterwards 'manually' install it:

git clone https://github.com/mle-infrastructure/mle-hyperopt.git
cd mle-hyperopt
pip install -e .

Search Method Highlights 🔎

Grid Search 🟥

strategy = GridSearch(
    real={"lrate": {"begin": 0.1,
                    "end": 0.5,
                    "bins": 5}},
    integer={"batch_size": {"begin": 1,
                            "end": 5,
                            "bins": 1}},
    categorical={"arch": ["mlp", "cnn"]},
    fixed_params={"momentum": 0.9})

configs = strategy.ask()

Hyperband 🎸

strategy = HyperbandSearch(
    real={"lrate": {"begin": 0.1,
                    "end": 0.5,
                    "prior": "uniform"}},
    integer={"batch_size": {"begin": 1,
                            "end": 5,
                            "prior": "log-uniform"}},
    categorical={"arch": ["mlp", "cnn"]},
    search_config={"max_resource": 81,
                   "eta": 3},
    seed_id=42,
    verbose=True)

configs = strategy.ask()

Population-Based Training 🦎

strategy = PBTSearch(
    real={"lrate": {"begin": 0.1,
                    "end": 0.5,
                    "prior": "uniform"}}
    search_config={
        "exploit": {"strategy": "truncation", "selection_percent": 0.2},
        "explore": {"strategy": "perturbation", "perturb_coeffs": [0.8, 1.2]},
        "steps_until_ready": 4,
        "num_workers": 10,
    },
    maximize_objective=True
)

configs = strategy.ask()

Further Options 🚴

Saving & Reloading Logs 🏪

# Storing & reloading of results from .json/.yaml/.pkl
strategy.save("search_log.json")
strategy = RandomSearch(..., reload_path="search_log.json")

# Or manually add info after class instantiation
strategy = RandomSearch(...)
strategy.load("search_log.json")

Search Decorator 🧶

from mle_hyperopt import hyperopt

@hyperopt(strategy_type="Grid",
          num_search_iters=25,
          real={"x": {"begin": 0., "end": 0.5, "bins": 5},
                "y": {"begin": 0, "end": 0.5, "bins": 5}})
def circle(config):
    distance = abs((config["x"] ** 2 + config["y"] ** 2))
    return distance

strategy = circle()

Storing Configuration Files 📑

# Store 2 proposed configurations - eval_0.yaml, eval_1.yaml
strategy.ask(2, store=True)
# Store with explicit configuration filenames - conf_0.yaml, conf_1.yaml
strategy.ask(2, store=True, config_fnames=["conf_0.yaml", "conf_1.yaml"])

Storing Checkpoint Paths 🛥️

# Ask for 5 configurations to evaluate and get their scores
configs = strategy.ask(5)
values = ...
# Get list of checkpoint paths corresponding to config runs
ckpts = [f"ckpt_{i}.pt" for i in range(len(configs))]
# `tell` parameter configs, eval scores & ckpt paths
# Required for Halving, Hyperband and PBT
strategy.tell(configs, scores, ckpts)

Retrieving Top Performers & Visualizing Results 📉

# Get the top k best performing configurations
id, configs, values = strategy.get_best(top_k=4)

# Plot timeseries of best performing score over search iterations
strategy.plot_best()

# Print out ranking of best performers
strategy.print_ranking(top_k=3)

Refining the Search Space of Your Strategy 🪓

# Refine the search space after 5 & 10 iterations based on top 2 configurations
strategy = RandomSearch(real={"lrate": {"begin": 0.1,
                                        "end": 0.5,
                                        "prior": "log-uniform"}},
                        integer={"batch_size": {"begin": 1,
                                                "end": 5,
                                                "prior": "uniform"}},
                        categorical={"arch": ["mlp", "cnn"]},
                        search_config={"refine_after": [5, 10],
                                       "refine_top_k": 2})

# Or do so manually using `refine` method
strategy.tell(...)
strategy.refine(top_k=2)

Note that the search space refinement is only implemented for random, SMBO and nevergrad-based search strategies.

The mle-hyperopt Package