The mle-hyperopt
Package
Hyperparameter optimization made easy ๐
The mle-hyperopt
package provides a simple and intuitive API for hyperparameter optimization of your Machine Learning Experiment (MLE) pipeline. It supports real, integer & categorical search variables and single- or multi-objective optimization.
Core features include the following:
- API Simplicity:
strategy.ask()
,strategy.tell()
interface & space definition. - Strategy Diversity: Grid, random, coordinate search, SMBO & wrapping FAIR's
nevergrad
, Successive Halving, Hyperband, Population-Based Training. - Search Space Refinement based on the top performing configs via
strategy.refine(top_k=10)
. - Export of configurations to execute via e.g.
python train.py --config_fname config.yaml
. - Storage & reload search logs via
strategy.save(<log_fname>)
,strategy.load(<log_fname>)
.
For a quickstart check out the notebook blog ๐.
The API ๐ฎ
from mle_hyperopt import RandomSearch
# Instantiate random search class
strategy = RandomSearch(real={"lrate": {"begin": 0.1,
"end": 0.5,
"prior": "log-uniform"}},
integer={"batch_size": {"begin": 32,
"end": 128,
"prior": "uniform"}},
categorical={"arch": ["mlp", "cnn"]})
# Simple ask - eval - tell API
configs = strategy.ask(5)
values = [train_network(**c) for c in configs]
strategy.tell(configs, values)
Implemented Search Types ๐ญ
Search Type | Description | search_config |
|
---|---|---|---|
GridSearch |
Search over list of discrete values | - | |
RandomSearch |
Random search over variable ranges | refine_after , refine_top_k |
|
CoordinateSearch |
Coordinate-wise optimization with fixed defaults | order , defaults |
|
SMBOSearch |
Sequential model-based optimization (Hutter et al., 2011) | base_estimator , acq_function , n_initial_points |
|
NevergradSearch |
Multi-objective nevergrad wrapper | optimizer , budget_size , num_workers |
|
HalvingSearch |
Successive Halving (Karmin et al., 2013) | min_budget , num_arms , halving_coeff |
|
HyperbandSearch |
Hyperband (Li et al., 2018) | max_resource , eta |
|
PBTSearch |
Population-Based Training (Jaderberg et al., 2017) | explore , exploit |
Variable Types & Hyperparameter Spaces ๐
Variable | Type | Space Specification | |
---|---|---|---|
real |
Real-valued | Dict : begin , end , prior /bins (grid) |
|
integer |
Integer-valued | Dict : begin , end , prior /bins (grid) |
|
categorical |
Categorical | List : Values to search over |
Installation โณ
A PyPI installation is available via:
pip install mle-hyperopt
Alternatively, you can clone this repository and afterwards 'manually' install it:
git clone https://github.com/mle-infrastructure/mle-hyperopt.git
cd mle-hyperopt
pip install -e .
Search Method Highlights ๐
Grid Search ๐ฅ
strategy = GridSearch(
real={"lrate": {"begin": 0.1,
"end": 0.5,
"bins": 5}},
integer={"batch_size": {"begin": 1,
"end": 5,
"bins": 1}},
categorical={"arch": ["mlp", "cnn"]},
fixed_params={"momentum": 0.9})
configs = strategy.ask()
Hyperband ๐ธ
strategy = HyperbandSearch(
real={"lrate": {"begin": 0.1,
"end": 0.5,
"prior": "uniform"}},
integer={"batch_size": {"begin": 1,
"end": 5,
"prior": "log-uniform"}},
categorical={"arch": ["mlp", "cnn"]},
search_config={"max_resource": 81,
"eta": 3},
seed_id=42,
verbose=True)
configs = strategy.ask()
Population-Based Training ๐ฆ
strategy = PBTSearch(
real={"lrate": {"begin": 0.1,
"end": 0.5,
"prior": "uniform"}}
search_config={
"exploit": {"strategy": "truncation", "selection_percent": 0.2},
"explore": {"strategy": "perturbation", "perturb_coeffs": [0.8, 1.2]},
"steps_until_ready": 4,
"num_workers": 10,
},
maximize_objective=True
)
configs = strategy.ask()
Further Options ๐ด
Saving & Reloading Logs ๐ช
# Storing & reloading of results from .json/.yaml/.pkl
strategy.save("search_log.json")
strategy = RandomSearch(..., reload_path="search_log.json")
# Or manually add info after class instantiation
strategy = RandomSearch(...)
strategy.load("search_log.json")
Search Decorator ๐งถ
from mle_hyperopt import hyperopt
@hyperopt(strategy_type="Grid",
num_search_iters=25,
real={"x": {"begin": 0., "end": 0.5, "bins": 5},
"y": {"begin": 0, "end": 0.5, "bins": 5}})
def circle(config):
distance = abs((config["x"] ** 2 + config["y"] ** 2))
return distance
strategy = circle()
Storing Configuration Files ๐
# Store 2 proposed configurations - eval_0.yaml, eval_1.yaml
strategy.ask(2, store=True)
# Store with explicit configuration filenames - conf_0.yaml, conf_1.yaml
strategy.ask(2, store=True, config_fnames=["conf_0.yaml", "conf_1.yaml"])
Storing Checkpoint Paths ๐ฅ๏ธ
# Ask for 5 configurations to evaluate and get their scores
configs = strategy.ask(5)
values = ...
# Get list of checkpoint paths corresponding to config runs
ckpts = [f"ckpt_{i}.pt" for i in range(len(configs))]
# `tell` parameter configs, eval scores & ckpt paths
# Required for Halving, Hyperband and PBT
strategy.tell(configs, scores, ckpts)
Retrieving Top Performers & Visualizing Results ๐
# Get the top k best performing configurations
id, configs, values = strategy.get_best(top_k=4)
# Plot timeseries of best performing score over search iterations
strategy.plot_best()
# Print out ranking of best performers
strategy.print_ranking(top_k=3)
Refining the Search Space of Your Strategy ๐ช
# Refine the search space after 5 & 10 iterations based on top 2 configurations
strategy = RandomSearch(real={"lrate": {"begin": 0.1,
"end": 0.5,
"prior": "log-uniform"}},
integer={"batch_size": {"begin": 1,
"end": 5,
"prior": "uniform"}},
categorical={"arch": ["mlp", "cnn"]},
search_config={"refine_after": [5, 10],
"refine_top_k": 2})
# Or do so manually using `refine` method
strategy.tell(...)
strategy.refine(top_k=2)
Note that the search space refinement is only implemented for random, SMBO and nevergrad-based search strategies.