EvidentlyAIDataProfiler

The EvidentlyAIDataProfiler class profiles and compares data sets using the EvidentlyAI library. It is a subclass of BaseDataProfiler.

Configuration

Required Configuration

The EvidentlyAI data profiler requires the following configuration:

  • local_dir: Path to a local directory where this component writes the files it generates.

Optional Configuration

The EvidentlyAI data profiler has no optional configuration beyond the settings listed under Default Configuration.

Default Configuration

The EvidentlyAI data profiler uses the following configuration, each with a default value:

  • evidentlyai_profile_report_name: The file name of the report generated by profile_data. Defaults to EVIDENTLYAI_DATA_PROFILE_REPORT.HTML.
  • evidentlyai_comparison_report_name: The file name of the data drift report generated by compare_data. Defaults to EVIDENTLYAI_DATA_COMPARISON_REPORT.HTML.
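
The interplay between the required and default settings can be sketched as follows. This is a minimal illustration, not the component's actual implementation: the helper names resolve_config and report_path are hypothetical, and the assumption that reports are written directly under local_dir is not confirmed by this page.

```python
from pathlib import Path

# Default report names as listed in the Default Configuration section.
DEFAULTS = {
    "evidentlyai_profile_report_name": "EVIDENTLYAI_DATA_PROFILE_REPORT.HTML",
    "evidentlyai_comparison_report_name": "EVIDENTLYAI_DATA_COMPARISON_REPORT.HTML",
}


def resolve_config(user_config):
    """Hypothetical helper: require local_dir and fill in default report names."""
    if not user_config.get("local_dir"):
        raise ValueError("local_dir is required")
    # Values supplied by the user take precedence over the defaults.
    return {**DEFAULTS, **user_config}


def report_path(config, name_key):
    """Hypothetical helper: assumed location of a report under local_dir."""
    return Path(config["local_dir"]) / config[name_key]


# Only the required setting is supplied; the report names fall back to defaults.
config = resolve_config({"local_dir": "reports"})
profile_path = report_path(config, "evidentlyai_profile_report_name")
```

Supplying evidentlyai_profile_report_name in the input dictionary would override the default in this sketch, while omitting local_dir raises an error.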

Methods

profile_data

Profiles data using EvidentlyAI.

profile_data(data, *args, **kwargs)

Parameters

  • data (pd.DataFrame): A pandas DataFrame containing the data to be profiled.

Returns

  • data_report (object): A Python object representing the generated report.
  • file_path (string): The file path of the exported report.

compare_data

Produces a data drift report between two data sets using EvidentlyAI.

compare_data(data, prev_data, *args, **kwargs)

Parameters

  • data (pd.DataFrame): A pandas DataFrame containing the "current" data.
  • prev_data (pd.DataFrame): A pandas DataFrame containing the "historical" data.

Returns

  • data_report (object): A Python object representing the generated report.
  • file_path (string): The file path of the exported report.
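
One way to obtain the two inputs for compare_data is to split a single data set around a cutoff date. The sketch below uses only pandas; the helper split_by_cutoff, the column names, and the sample values are hypothetical and serve purely as illustration.

```python
import pandas as pd


def split_by_cutoff(frame, timestamp_column, cutoff):
    """Hypothetical helper: split a frame into (current, historical) parts."""
    timestamps = pd.to_datetime(frame[timestamp_column])
    is_current = timestamps >= pd.Timestamp(cutoff)
    # Rows on or after the cutoff are "current"; earlier rows are "historical".
    current = frame.loc[is_current].reset_index(drop=True)
    historical = frame.loc[~is_current].reset_index(drop=True)
    return current, historical


# Illustrative records only; real data would come from your own source.
records = pd.DataFrame({
    "event_time": ["2024-01-05", "2024-01-20", "2024-02-03", "2024-02-17"],
    "amount": [10.0, 12.5, 30.0, 28.0],
})

# data holds the "current" rows, prev_data the "historical" rows.
data, prev_data = split_by_cutoff(records, "event_time", "2024-02-01")
```

Both frames keep the same columns, so they can be passed on as compare_data(data, prev_data) in the argument order documented above.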

Usage

Here is an example of how to use the EvidentlyAIDataProfiler class:

import pandas as pd
from EvidentlyAIDataProfiler import EvidentlyAIDataProfiler

# Create an instance of the EvidentlyAIDataProfiler class.
# The required local_dir setting must be configured so reports can be written.
profiler = EvidentlyAIDataProfiler()

# Load the data to profile into a pandas DataFrame (replace ... with your data)
data = pd.DataFrame(...)

# Profile the data; returns the report object and the path of the exported report
profile_report, profile_path = profiler.profile_data(data)

# Compare the current data against a previous ("historical") version
prev_data = pd.DataFrame(...)
comparison_report, comparison_path = profiler.compare_data(data, prev_data)