SweetvizDataProfiler¶
The SweetvizDataProfiler class is used for generating data profiling reports and data comparison reports using the Sweetviz library.
Configuration¶
Required Configuration¶
The Sweetviz data profiler requires the following configuration:
local_dir: Location of a local directory to output files generated by this component.
Optional Configuration¶
The Sweetviz data profiler uses the following optional configuration.
model_target: The target feature in the data.
Default Configuration¶
The Sweetviz data profiler uses the following default configuration:
SWEETVIZ_PROFILE_REPORT_NAME: The name of the profile report file. Default is "SWEETVIZ_DATA_PROFILE_REPORT.HTML".
SWEETVIZ_COMPARISON_REPORT_NAME: The name of the comparison report file. Default is "SWEETVIZ_DATA_COMPARISON_REPORT.HTML".
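As a rough illustration of how these settings fit together, the sketch below builds a configuration dictionary and resolves where a report would be written. The flat dictionary layout and the `resolve_report_path` helper are assumptions made for this example; they are not part of lolpop, and the component may organize its configuration differently.

```python
import os

# Default report file names, as listed in the default configuration above.
DEFAULTS = {
    "SWEETVIZ_PROFILE_REPORT_NAME": "SWEETVIZ_DATA_PROFILE_REPORT.HTML",
    "SWEETVIZ_COMPARISON_REPORT_NAME": "SWEETVIZ_DATA_COMPARISON_REPORT.HTML",
}


def resolve_report_path(config, name_key):
    """Hypothetical helper: join local_dir with the configured or default file name."""
    if "local_dir" not in config:
        raise KeyError("local_dir is required")
    file_name = config.get(name_key, DEFAULTS[name_key])
    return os.path.join(config["local_dir"], file_name)


# Assumed flat layout: one required key and one optional key.
config = {
    "local_dir": "reports",   # required: output directory for generated files
    "model_target": "label",  # optional: target feature in the data
}

profile_path = resolve_report_path(config, "SWEETVIZ_PROFILE_REPORT_NAME")
comparison_path = resolve_report_path(config, "SWEETVIZ_COMPARISON_REPORT_NAME")
```

Setting either report name key in the dictionary would override the corresponding default in this sketch.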
Methods¶
profile_data¶
Profiles data using Sweetviz.
profile_data(data, *args, **kwargs)
Arguments
data(pd.DataFrame): A dataframe of the data to profile.
Returns
data_report(object): Python object of the report.
file_path(string): File path of the exported report.
Example
import pandas as pd
from lolpop.component import SweetvizDataProfiler

config = {
    # insert component configuration here
}
profiler = SweetvizDataProfiler(conf=config)
data = pd.read_csv("data.csv")
report, path = profiler.profile_data(data)
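For readers who want a sense of what a profiling call involves at the Sweetviz level, here is a minimal sketch that uses Sweetviz directly. The function `profile_with_sweetviz` and its parameters are hypothetical names chosen for this example; it approximates the documented behavior (returning a report object and a file path) and is not lolpop's actual implementation.

```python
import os


def profile_with_sweetviz(data, local_dir, model_target=None,
                          report_name="SWEETVIZ_DATA_PROFILE_REPORT.HTML"):
    """Hypothetical stand-in for profile_data; not lolpop's actual code."""
    # Imported lazily so this module can be loaded without Sweetviz installed.
    import sweetviz as sv

    # Make sure the output directory exists before writing the report.
    os.makedirs(local_dir, exist_ok=True)
    file_path = os.path.join(local_dir, report_name)

    # With target_feat=None, Sweetviz profiles the data without a target feature.
    report = sv.analyze(data, target_feat=model_target)

    # Write the HTML report to disk without opening a browser window.
    report.show_html(filepath=file_path, open_browser=False)
    return report, file_path
```

Given a pandas DataFrame `df`, calling `profile_with_sweetviz(df, "reports")` would write the HTML file under `reports` and return the Sweetviz report object together with the file path.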
compare_data¶
Produces a data drift report between two data sets using Sweetviz.
compare_data(data, prev_data, *args, **kwargs)
Arguments
data(pd.DataFrame): A dataframe of the "current" data.
prev_data(pd.DataFrame): A dataframe of the "historical" data.
Returns
data_report(object): Python object of the report.
file_path(string): File path of the exported report.
Example
import pandas as pd
from lolpop.component import SweetvizDataProfiler

config = {
    # insert component configuration here
}
profiler = SweetvizDataProfiler(conf=config)
current_data = pd.read_csv("current_data.csv")
previous_data = pd.read_csv("previous_data.csv")
report, path = profiler.compare_data(current_data, previous_data)
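A comparison can be sketched directly against Sweetviz in a similar way. The function `compare_with_sweetviz`, its parameters, and the "Current" and "Previous" labels are assumptions made for this example; the sketch mirrors the documented return values but is not lolpop's actual implementation.

```python
import os


def compare_with_sweetviz(data, prev_data, local_dir, model_target=None,
                          report_name="SWEETVIZ_DATA_COMPARISON_REPORT.HTML"):
    """Hypothetical stand-in for compare_data; not lolpop's actual code."""
    # Imported lazily so this module can be loaded without Sweetviz installed.
    import sweetviz as sv

    # Make sure the output directory exists before writing the report.
    os.makedirs(local_dir, exist_ok=True)
    file_path = os.path.join(local_dir, report_name)

    # Each dataset is passed as [dataframe, label]; the labels appear in the report.
    report = sv.compare([data, "Current"], [prev_data, "Previous"],
                        target_feat=model_target)

    # Write the HTML report to disk without opening a browser window.
    report.show_html(filepath=file_path, open_browser=False)
    return report, file_path
```

Sweetviz compares the two datasets feature by feature, so both dataframes should share the columns of interest for the report to be meaningful.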