OptunaHyperparameterTuner¶
The OptunaHyperparameterTuner class is a subclass of BaseHyperparameterTuner, used for optimizing model hyperparameters using the Python library Optuna.
Configuration¶
Required Configuration¶
In addition to the required configuration in the BaseHyperparameterTuner, OptunaHyperparamterTuner also requires the following:
local_dir: A local directory to use during operation.
Default Configuration¶
OptunaHyperparamterTuner uses the following default configuration:
param_type: The parameter type used in thetraining_paramsconfiguration. Default is "fixed". "dynamic" is also accepted.optuna_timeout: The maximum amount of time to run experiments before gracefully exiting. Default is 3600 (i.e. one hour).num_jobs: The number of concurrent experiments to run in Optuna. Defaults to 1.num_trials: The maximum number of trials to run. Only used ifparam_typeis "dynamic". otherwise,num_trailsis equal to the length of the parameter grid. Deafaults to 100.
Methods¶
run_experiment¶
This method is used to optimize the algorithm hyperparameters using Optuna. This runs an Optuna study and will select the best experiment (i.e. Optuna trial) created during the study run time.
run_experiment(data, model_version, *args, **kwargs)
data(object): represents the dataset to train the model,model_version(object): represents the object version from the metadata tracker,
Returns:
-The best generated model as an object.
_optuna_objective¶
This is the objective function for Optuna optimization. This should typically not be overridden.
def _optuna_objective(trial, param_type, data, model_version, algo, params, trainer_config, metrics, perf_metric, experiment_list, model_list)
trial(object): TheOptunatrial object.param_type(str): Theparam_typeused, either "fixed" or "dynamic".data(object): The data to use for model training and evaluation.model_version(object): The model version object obtained from themetadata_tracker.algo(str): The name of themodel_trainerclass to use for this experiment.params(dict): A dictionary of key-value pairs to use for model configuration.trainer_config(dict): A dictionary of key-value pairs to use for themodel_trainerconfigurationmetrics(list): A list of metrics to calculate.perf_metric(str): The metric to use to determine the winning experiment.exp_list(dict): A dictionary of completed experiments and theirperf_metricvalue. This will be added to with the current experiment.model_list(dict): A dictionary of completed experiments and their model objects. This will be added to with the current experiment.
Returns:
- The value of the specified performance metric for the trained model.
_parse_dynamic_logic¶
This method is used to parse each value of a dynamic configuration to the proper Optuna type.
def _parse_dynamic_logic(self, trial, name, p)
Arguments:
trial(object): theOptunatrial object.name(str): name of the configuration parameterp(dict): dictionary containing the type, and value/range/choices of the parameter
Returns:
- Returns the parsed value in Optuna form. I.E. will return optuna.suggest_X() where X is based on the input type.
_get_dynamic_params¶
This method retrieves the parameters corresponding to the dynamic type of parameter tuning.
python
def _get_dynamic_params(self, trial, params)
Arguments:
trial(object): TheOptunatrial object.params(dict): A dictionary of parameter values.
Returns:
- A dictionary containing the generated parameters. (Generated via _parse_dynamic_logic)
_get_fixed_params¶
This method retrieves the parameters corresponding to the fixed type of parameter tuning.
def _get_fixed_params(self, trial, params)
Arguments:
trial(object): TheOptunatrial object.params(dict): A dictionary of parameter values.
Returns:
A dictionary containing the generated parameters.
_log_trial¶
This method is used to log the trial run, the generated model, and the parameters.
def _log_trial(self, trial, model_params, model, experiment, algo)
trial object, a dictionary of generated parameters, the generated model, the experiment used to train the model, and the name of the algorithm being used for training. The method returns the experiment used to train the model.
Arguments:
trial(object): TheOptunatrial object.model_params(dict): A dictionary of parameter values.model(object): The model object.experiment(object): The experiment object obtained from themetadata_tracker.algo(str): The name of themodel_trainerclass to use for this experiment.
Returns: - The (possibly updated) experiment object.
_log_study¶
This method is used to save interesting parts of the study containing the optimization history, which can be referred to later.
def _log_study(self, study, model_version, algo)
Arguments:
- study (object): The Optuna study object.
- model_version (object): The model version object to log the study into.
- algo (str): The name of the model_trainer class used for this experiment.
_save_plot¶
This method is used to save an HTML plot containing the Optuna optimization history to disk.
_save_plot(self, name, plot, model_version, algo)
Arguments:
name(str): A name for the saved HTML fileplot(object): TheOptunaplotmodel_version(object): The model version object to save the plot into, obtained from themetadata_trackeralgo(str): The name of themodel_trainerclass used for this experiment.
Returns:
An instance of the Artifact class corresponding to the saved plot.
Usage¶
from lolpop.component import OptunaHyperparameterTuner, LocalDataSplitter, MLFlowMetadataTracker
import pandas as pd
#get data
data_splitter_config = {
#insert component config
}
data_splitter = LocalDataSplitter(conf=data_splitter_config)
df = pd.read_csv("/path/to/data.csv")
data = data_spliter.split_data(df)
#get model_version
metadata_tracker_config = {
#insert component config here
}
metadata_tracker = MLFlowMetadataTracker(conf=config)
model_version = metadata_tracker.create_resource(id, type="model_version")
config = {
#insert component config here
}
# create an instance of LocalHyperparameterTuner
oht = OptunaHyperparameterTuner(conf=config)
# run the experiment
best_model = oht.run_experiment(data, model_version)
Sample training parameters¶
Fixed¶
hyperparameter_tuner:
config:
param_type: fixed
training_params:
XGBoostModelTrainer:
objective: ["multiclass"]
max_depth: [4,8]
alpha: [1,5,10]
learning_rate: [1.0]
n_estimators: [10]
random_state: [0]
Dynamic¶
hyperparameter_tuner:
config:
param_type: dynamic
training_params:
XGBoostModelTrainer:
objective: {"type": "categorical", "choices" : ["multiclass"]}
max_depth: {"type": "int", "range" : [4,8]}
alpha: {"type": "float", "range" : [1,10]}
learning_rate: {"type": "float", "range" : [0.1,1]}
n_estimators: {"type": "int", "range" : [1,10]}
random_state: {"type": "fixed", "value" : 0}