BaseModelChecker
Overview¶
A model_checker
is a component that takes a model and runs checks on that model. These model checks are typically concerned with things like model error, overfitting, baseline comparisons, drift, etc.
BaseModelChecker
exposes two main methods for children to implement: check_model
and calculate_model_drift
. Additionally, it implements its own baseline comparison method.
Attributes¶
BaseModelChecker
contains no default attributes.
Configuration¶
BaseModelChecker
contains the following required components:
metrics_tracker
BaseModelChecker
contains the following required configuration:
- baseline_method: The baseline method to use for baseline comparisons. Possible values for baseline_method are column and value.
- baseline_value: The value of the baseline method to use. If baseline_method is column, then baseline_value should be the name of the column to use as the baseline. This scenario is designed for user-defined baselines, i.e. you can create whatever baseline you wish and provide it as a separate column. If baseline_method is value, then the following are accepted for baseline_value:
  - problem_type: classification
    - class_avg: randomly selects a class based on the frequency with which each class appears in the provided dataset.
    - class_most_frequent: selects the most frequent class.
  - problem_type: regression
    - avg: selects the average value
    - mode: selects the mode
    - max: selects the max value
    - min: selects the min value
    - median: selects the median value
  - problem_type: timeseries
    - last_value: selects the previous value
    - lag_mean: selects the mean over a window
    - lag_max: selects the max over a window
    - lag_min: selects the min over a window
    - lag_median: selects the median over a window
    - For all lag_ baseline_values you can specify the window via lag_mean_X, where X is the number of rows to use in the window, e.g. lag_mean_7 takes the mean over the last 7 values.
- perf_metric: The performance metric used to determine if one model (or baseline) performs better than another. A configuration sketch covering these options follows below.
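As a loose illustration of how these options fit together (the surrounding configuration structure and the metric name are assumptions; only baseline_method, baseline_value, and perf_metric are documented above), a model checker configuration might look like:

# Hypothetical configuration for a BaseModelChecker subclass; only baseline_method,
# baseline_value, and perf_metric are documented above. How this dict reaches the
# component depends on how your pipeline wires components together.
model_checker_config = {
    "baseline_method": "value",      # or "column" to use a user-defined baseline column
    "baseline_value": "lag_mean_7",  # timeseries example: mean over the last 7 values
    "perf_metric": "rmse",           # assumed metric name; use whatever your metrics_tracker supports
}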
Interface¶
The following methods are part of BaseModelChecker
and should be implemented in any class that inherits from this base class:
check_model¶
Performs a model check on the given model.
def check_model(self, data, model, *args, **kwargs) -> tuple[Any, str, str]
Arguments:
- data (dict): A dictionary of train/test data.
- model (object): The model to check.
Returns:
- model_report (Any): Python object of the model report.
- file_path (string): Path to the exported report.
- checks_status (string): Status of the checks ("PASS"/"WARN"/"ERROR", etc.)
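A minimal sketch of a subclass implementing check_model, assuming BaseModelChecker is importable in your project (the class name, report contents, and file handling below are illustrative, not part of the documented interface):

import json
from typing import Any

# from your_framework.component import BaseModelChecker  # import path depends on your setup

class MyModelChecker(BaseModelChecker):
    def check_model(self, data, model, *args, **kwargs) -> tuple[Any, str, str]:
        # Run whatever checks you care about (error analysis, overfitting, etc.).
        report = {
            "train_rows": len(data.get("X_train", [])),
            "test_rows": len(data.get("X_test", [])),
        }
        # Export the report and hand back its path.
        file_path = "model_report.json"
        with open(file_path, "w") as f:
            json.dump(report, f)
        checks_status = "PASS"  # or "WARN"/"ERROR" based on your own thresholds
        return report, file_path, checks_status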
calculate_model_drift¶
Performs a drift comparison between two models.
def calculate_model_drift(self, data, model, deployed_model, *args, **kwargs) -> tuple[Any, str]
Arguments:
- data (dict): A dictionary of train/test data.
- model (object): The model to check.
- deployed_model (object): The currently deployed model.
Returns:
- model_report (Any): Python object of the model report.
- file_path (string): Path to the exported report.
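Continuing the hypothetical MyModelChecker sketch above, calculate_model_drift could compare the two models' predictions on the shared test split (the predict calls, data keys, and report contents are assumptions about your model interface):

    def calculate_model_drift(self, data, model, deployed_model, *args, **kwargs) -> tuple[Any, str]:
        # Compare predictions from the candidate and the deployed model on the same data.
        new_preds = model.predict(data["X_test"])
        old_preds = deployed_model.predict(data["X_test"])
        report = {"num_changed_predictions": int(sum(n != o for n, o in zip(new_preds, old_preds)))}
        file_path = "model_drift_report.json"
        with open(file_path, "w") as f:
            json.dump(report, f)  # json imported in the check_model sketch above
        return report, file_path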
Default Methods¶
The following are default methods that are implemented in the base class. They can be overridden by inheriting classes as needed.
get_baseline_comparison¶
Compare the model performance against a baseline.
def get_baseline_comparison(self, data, model, model_version, *args, **kwargs) -> tuple[bool, float]
Arguments:
- data (dict): The data for model evaluation.
- model (obj): The trained model object.
- model_version (obj): The model version object, obtained from a metadata_tracker.
Returns:
- (bool) Is the baseline better than the model?
- (float) The metric difference between the model and the baseline.
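A short usage sketch (model_checker, data, model, and model_version are placeholders for objects created elsewhere in your pipeline):

# model_checker is an instance of a BaseModelChecker subclass; data, model, and
# model_version are assumed to come from earlier pipeline steps.
baseline_is_better, metric_diff = model_checker.get_baseline_comparison(data, model, model_version)
if baseline_is_better:
    print("Baseline outperformed the model by %.4f" % metric_diff)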
compare_models¶
Compare the performance of the current model against the previous model.
def compare_models(self, data, model, prev_model, model_version, prev_model_version=None, *args, **kwargs) -> bool
Arguments:
- data (dict): The data for model evaluation.
- model (object): The current trained model object.
- prev_model (object): The previous trained model object.
- model_version (object): The version of the current model.
- prev_model_version (object): The version of the previous model (default None).
Returns:
- bool: True if the current model is better than the previous model, False otherwise.
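A brief usage sketch (all object names are placeholders from earlier pipeline steps):

# Returns True if the newly trained model beats the previously trained one.
current_is_better = model_checker.compare_models(data, model, prev_model, model_version)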
_get_baseline_predictions¶
Get baseline predictions based on the baseline method and value.
def _get_baseline_predictions(self, data, baseline_method, baseline_value, *args, **kwargs) -> dict[str,Any]
Arguments:
- data (dict): The data for baseline comparison.
- baseline_method (str): The method for baseline comparison.
- baseline_value (str): The value used in baseline comparison.
Returns:
- dict: A dictionary containing the baseline predictions for different data splits.
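An illustrative call (the baseline_method/baseline_value pair and the split names in the returned dict are assumptions):

# Generate baseline predictions, e.g. the most frequent class for a classification problem.
baseline_preds = model_checker._get_baseline_predictions(data, "value", "class_most_frequent")
# Expected shape (assumed): {"train": [...], "test": [...]}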
_get_metric_comparison¶
Compares two metric values and determines which is better.
def _get_metric_comparison(self, champion_metric, challenger_metric, perf_metric, *args, **kwargs) -> tuple[bool, float]
Arguments:
- champion_metric (dict): champion metric
- challenger_metric (dict): challenger metric
- perf_metric (str): The performance metric to use to determine which metric is best
Returns:
- bool: is the challenger metric better? True or False
- float: difference between champion and challenger metric
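A loose illustration of invoking this comparison (the shape of the metric dicts, keyed by metric name, is an assumption):

champion = {"rmse": 2.31}    # metrics of the currently deployed (champion) model
challenger = {"rmse": 2.05}  # metrics of the newly trained (challenger) model
challenger_is_better, diff = model_checker._get_metric_comparison(champion, challenger, "rmse")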
_compare_model_performance¶
Compares the performance of the current model against the deployed model on a static data set.
def _compare_model_performance(self, model, deployed_model, data, perf_metric, current_model_version, *args, **kwargs) -> tuple[bool, float]
Arguments:
- model (object): current model
- deployed_model (object): deployed model
- data (dictionary): dictionary of training/test data
- perf_metric (str): performance metric to use to determine which model is better
- current_model_version (object): current model version to log metrics into
Returns:
- bool: Is the new model better? True or False
- float: difference between metric values
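A short usage sketch (object names and the metric are placeholders; metrics are logged into the supplied model version):

# Hypothetical promotion check: is the newly trained model better than the deployed one?
new_is_better, metric_diff = model_checker._compare_model_performance(
    model, deployed_model, data, "rmse", model_version
)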