BaseCacheManager
Overview¶
A cache_manager is a component that can cache the inputs and outputs of class methods. It is intended for development workflows, where it lets you iterate quickly on troublesome areas by skipping parts of the pipeline whose inputs have not changed. In these cases, the cache manager retrieves results from the cache instead of executing the method again.
The BaseCacheManager class provides an interface and some default methods that child classes can use to implement method caching. In particular, child classes should mainly implement the cache and retrieve methods, which determine how data is saved and loaded, and, depending on their use case, possibly a more robust equals.
Also be sure to check out Using Decorators to learn more about using decorators in your own workflows.
Note
The cache_manager is considered experimental and is not recommended for production use. It is recommended only as a way to accelerate debugging in development workflows.
Attributes¶
BaseCacheManager
contains no default attributes.
Configuration¶
Required Configuration¶
BaseCacheManager
contains no required configuration or components.
Optional Configuration¶
BaseCacheManager
has the following optional configuration:
integration_class: A list of class names to decorate with the cache decorator_method. You can use this to explicitly cache certain classes in your workflow. Note that this overrides integration_types, but is itself overridden by a class's own no_cache configuration.
Default Configuration¶
BaseCacheManager
uses the following default configuration:
- decorator_method: The method used to decorate class methods so that they are cached after execution. Defaults to cache_decorator. This should not be changed unless you really know what you are doing.
- integration_types: The integration type(s) to decorate with the cache decorator_method. This defaults to ["component"], and it is only recommended to cache at the component level.
Interface¶
The following methods are part of BaseCacheManager
and should be implemented in any class that inherits from this base class:
cache¶
This method takes a key-value pair and caches the value under the specified key. You can then retrieve the value from the cache by passing the same key to retrieve.
def cache(self, key, value, *args, **kwargs) -> str
Arguments:
key (str): The key to use to cache the data.
value (object): The object to cache.
Returns:
str: The location of the object in the cache itself, e.g. a folder location or a bucket location.
retrieve¶
This method retrieves a cached object from the cache given the provided key.
def retrieve(self, key, *args, **kwargs) -> Any
Arguments:
key
(str): The key of the cached object.
Returns:
object: The object that was stored in the cache via cache.
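As a minimal sketch of how a child class might implement cache and retrieve, the example below pickles values into a local directory. PickleCacheManager, cache_dir, the file naming, and returning None for a missing key are all hypothetical choices made for illustration and are not part of lolpop. A real implementation would inherit from BaseCacheManager rather than stand alone.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Any


class PickleCacheManager:
    """Hypothetical stand-in for a BaseCacheManager child class."""

    def __init__(self, cache_dir: str):
        # cache_dir is an assumed setting, not a documented lolpop option.
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def cache(self, key: str, value: Any, *args, **kwargs) -> str:
        # Store the value under the key and return its location in the cache.
        path = self.cache_dir / f"{key}.pkl"
        with open(path, "wb") as f:
            pickle.dump(value, f)
        return str(path)

    def retrieve(self, key: str, *args, **kwargs) -> Any:
        # Return the cached object, or None if nothing is stored under the key.
        path = self.cache_dir / f"{key}.pkl"
        if not path.exists():
            return None
        with open(path, "rb") as f:
            return pickle.load(f)


manager = PickleCacheManager(tempfile.mkdtemp())
location = manager.cache("train_output", {"accuracy": 0.9})
restored = manager.retrieve("train_output")
```

Returning the file path from cache matches the intent that the return value describes where the object lives in the cache.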
Default Methods¶
The following are default methods that are implemented in the base class. They can be overridden by inheriting classes as needed.
equals¶
This method compares two objects to determine whether they are equal. It uses lolpop.utils.common_utils.compare_objects, which is a "quick and dirty" way to compare objects. If you need a more robust comparison, you may wish to override this method.
def equals(self, objA, objB, *args, **kwargs) -> bool
Arguments:
objA (object): The first object.
objB (object): The second object.
Returns:
bool: Whether the two objects are equal.
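One way an override of equals could look is sketched below. It is an assumption about what "more robust" might mean, not lolpop's own logic: types must match exactly, containers are compared element by element, and any comparison that fails or is ambiguous counts as "not equal", so the method is re-executed rather than served from a stale cache. StrictEqualsCacheManager is a hypothetical name.

```python
from typing import Any


class StrictEqualsCacheManager:
    """Hypothetical child class; only the equals override is shown."""

    def equals(self, objA: Any, objB: Any, *args, **kwargs) -> bool:
        # Objects of different types are never treated as equal.
        if type(objA) is not type(objB):
            return False
        # Compare containers element by element so nested values are checked too.
        if isinstance(objA, dict):
            return objA.keys() == objB.keys() and all(
                self.equals(objA[k], objB[k]) for k in objA
            )
        if isinstance(objA, (list, tuple)):
            return len(objA) == len(objB) and all(
                self.equals(a, b) for a, b in zip(objA, objB)
            )
        # Fall back to ==; a failing or ambiguous comparison counts as "not equal".
        try:
            return bool(objA == objB)
        except Exception:
            return False


manager = StrictEqualsCacheManager()
same = manager.equals({"a": [1, 2]}, {"a": [1, 2]})
different_type = manager.equals(1, 1.0)
```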
cache_decorator¶
This is a default decorator that can be used to cache objects. It should be sufficient for any class inheriting from BaseCacheManager, provided that the class implements non-trivial cache and retrieve methods.
lolpop will decorate classes with this decorator as specified in integration_types or integration_classes. Methods in wrapped classes will then operate as follows:
- When the method is executed, the cache will be checked for all of the following:
    - Input arguments
    - Input keyword arguments
    - The configuration of the integration
    - The actual method code
- Each of the above is compared to the currently executing method. If no differences are found, then the output of the method is retrieved from the cache and returned.
- If no previous output is found, or if differences are found in the method inputs, configuration, or code itself, then lolpop will execute the method as normal and return the output.
- If the method is executed, then the method inputs, configuration, code, and output are all cached.
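The steps above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not lolpop's implementation: InMemoryCacheManager and Trainer are hypothetical, the cache is a plain dict, a single cache entry is kept per method instead of one per distinct call, the integration's configuration is assumed to live in a config attribute, and the method's bytecode stands in for "the actual method code".

```python
import functools
from typing import Any, Callable, Dict


class InMemoryCacheManager:
    """Hypothetical cache manager that keeps everything in a dict."""

    def __init__(self):
        self._store: Dict[str, Any] = {}

    def cache(self, key: str, value: Any) -> str:
        self._store[key] = value
        return key

    def retrieve(self, key: str) -> Any:
        return self._store.get(key)

    def equals(self, objA: Any, objB: Any) -> bool:
        return objA == objB

    def cache_decorator(self, func: Callable, cls: type) -> Callable:
        base_key = f"{cls.__name__}.{func.__name__}"

        @functools.wraps(func)
        def wrapper(obj, *args, **kwargs):
            # Everything that must be unchanged for a cache hit.
            current = {
                "args": args,
                "kwargs": kwargs,
                "config": dict(getattr(obj, "config", {})),
                "code": func.__code__.co_code,  # stand-in for the method code
            }
            previous = self.retrieve(f"{base_key}.inputs")
            if previous is not None and self.equals(previous, current):
                # Nothing changed: return the cached output without executing.
                return self.retrieve(f"{base_key}.output")
            # No previous run, or something changed: execute and cache everything.
            output = func(obj, *args, **kwargs)
            self.cache(f"{base_key}.inputs", current)
            self.cache(f"{base_key}.output", output)
            return output

        return wrapper


class Trainer:
    config = {"epochs": 3}
    calls = 0

    def fit(self, x):
        Trainer.calls += 1
        return x * 2


manager = InMemoryCacheManager()
Trainer.fit = manager.cache_decorator(Trainer.fit, Trainer)
trainer = Trainer()
first = trainer.fit(2)   # executed and cached
second = trainer.fit(2)  # same inputs: served from the cache
third = trainer.fit(5)   # inputs changed: executed again
```

In this sketch the method body runs only when the inputs, configuration, or code differ from the previous run, so Trainer.calls counts two executions for the three calls.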
Note
cache_decorator should be applied either via decorators when defining your workflow in your workflow yaml file (see Using Decorators), or directly to integrations via the lolpop.utils.common_utils.decorate_all_methods class decorator.
def cache_decorator(self, func, cls):
_stringify_input¶
Helper function that converts method inputs to a string. It maps all inputs of a method call to a single string and tries to ensure that distinct calls of the method map to distinct values (a method might be called multiple times during a workflow).
This attempts to form a string by calling lolpop.utils.common_utils.convert_arg_to_string on every argument, keyword argument, and configuration item of the calling method. The resulting string is hashed if it is longer than 200 characters.
def _stringify_input(self, obj, func, args, kwargs) -> str:
Arguments:
obj (object): The integration object that is being used.
func (function): The method of the integration object that is being executed.
args (list): A list of arguments passed into func.
kwargs (dict): A dictionary of keyword arguments passed into func.
Returns:
str: A string representing this particular execution of func.
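A rough sketch of this idea is shown below. It is not lolpop's code: the local convert_arg_to_string is a hypothetical stand-in for the lolpop utility of the same name, and the inclusion of the class and method names, the separator, the config attribute, and the choice of SHA-256 as the hash are all assumptions made for illustration.

```python
import hashlib


def convert_arg_to_string(arg) -> str:
    # Hypothetical stand-in for lolpop.utils.common_utils.convert_arg_to_string.
    return str(arg)


def stringify_input(obj, func, args, kwargs, max_length: int = 200) -> str:
    # Start with the class and method name (an assumption of this sketch).
    parts = [type(obj).__name__, func.__name__]
    # Add every positional argument, keyword argument, and configuration item.
    parts += [convert_arg_to_string(a) for a in args]
    parts += [f"{k}={convert_arg_to_string(v)}" for k, v in sorted(kwargs.items())]
    config = getattr(obj, "config", {})
    parts += [f"{k}={convert_arg_to_string(v)}" for k, v in sorted(config.items())]
    result = "_".join(parts)
    # Long strings are replaced by a fixed-length hash of their contents.
    if len(result) > max_length:
        result = hashlib.sha256(result.encode("utf-8")).hexdigest()
    return result


class Trainer:
    config = {"epochs": 3}

    def fit(self, x, verbose=False):
        return x


trainer = Trainer()
short_key = stringify_input(trainer, Trainer.fit, [1], {"verbose": True})
long_key = stringify_input(trainer, Trainer.fit, ["x" * 500], {})
```

Here short_key stays readable because it is under the 200-character limit, while long_key is replaced by a 64-character hex digest.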
Usage¶
pipeline:
  process: OfflineProcess
  train: OfflineTrain
  predict: OfflinePredict
component:
  metadata_tracker: MLFlowMetadataTracker
  notifier: StdOutNotifier
  resource_version_control: dvcVersionControl
  metrics_tracker: MLFlowMetricsTracker
decorator:
  cache_manager: LocalCacheManager
...
cache_manager:
  config:
    decorator_method: cache_decorator
    integration_types: ["component"]
...