LocalDataConnector¶
This is a Python class that inherits from BaseDataConnector class and provides methods to read and write CSV, parquet, or ORC file to local storage. The class has two main methods get_data() and save_data().
Configuration¶
Required Configuration¶
There is no required configuration.
Optional Configuration¶
There is not optional configuration.
Default Configuration¶
LocalDataConnector uses the following default configuration:
save_index: To save the dataframe index as part of the dataset. Defaults to True.
Methods:¶
get_data¶
This method reads data from the specified file path (CSV, Parquet, or ORC) into memory as a pandas DataFrame. The method returns the loaded data as a pandas DataFrame. The loaded file is read using the Pyarrow engine which supports the reading of these file formats.
def get_data(self, source_path, *args, **kwargs)
Arguments:
source_path(str): The path to the data source file.
Returns:
pandas.DataFrame: Returns data as a pandas DataFrame.
Example:
import pandas as pd
from lolpop.component import LocalDataConnector
config = {
#insert component config here
}
# Create an instance of LocalDataConnector
connector = LocalDataConnector(conf=config)
# Data file path
data_path = "/datafolder/sales/sales_data.csv"
# Load the data into a pandas dataframe
data = connector.get_data(data_path)
# Display the first 5 rows of the data
print(data.head(5))
save_data¶
This method writes data from a pandas DataFrame into a specified file path as CSV or Parquet format and returns the saved data.
def save_data(self, data, target_path, *args, **kwargs)
Arguments:
-
data(pandas.DataFrame): Data to be saved. -
target_path(str): The path to the target file.
Returns:
pandas.DataFrame: Returns saved data.
Example:
import pandas as pd
from lolpop.component import LocalDataConnector
config = {
#insert component config here
}
# Create an instance of LocalDataConnector
connector = LocalDataConnector(conf=config)
# Load sample data into pandas DataFrame
data = pd.DataFrame({
"product_name": ["Product A", "Product B", "Product C"],
"quantity_sold": [100, 200, 300],
"revenue": [2500, 5000, 7500]
})
# Define file path to save the data
save_path = "/data/sales/sales_data.csv"
# Save the data to the specified file path
connector.save_data(data, save_path)