Load Documentation

class ecopipeline.load.AlarmLoader

Bases: Loader

Loader subclass for writing alarm data to the alarm and alarm_inst MySQL tables.

Overrides Loader table-existence checks and table-creation logic so that the paired alarm / alarm_inst tables are always created and validated together.

Methods

check_table_exists(cursor, table_name, dbname)

Check whether a table exists in the database.

create_new_table(cursor, table_name, ...[, ...])

Create a new table in the MySQL database.

report_data_loss(config[, site_name])

Log a DATA_LOSS_COP event in the site_events table.

check_table_exists(cursor: MySQLCursor, table_name: str, dbname: str) → bool

Return True only when both the alarm table and its _inst companion exist.

Both tables must be present before loading can proceed; if either is missing they should both be (re)created together.
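A minimal sketch of the paired check, assuming the underlying single-table lookup counts matches in information_schema (the helper name and query here are illustrative, not the actual implementation):

```python
from unittest.mock import MagicMock

def single_table_exists(cursor, table_name, dbname):
    # Illustrative stand-in for the base Loader check: count matching
    # rows in information_schema.tables.
    cursor.execute(
        "SELECT COUNT(*) FROM information_schema.tables "
        "WHERE table_schema = %s AND table_name = %s",
        (dbname, table_name),
    )
    return cursor.fetchone()[0] > 0

def paired_tables_exist(cursor, table_name, dbname):
    # Both the alarm table and its _inst companion must be present.
    return (single_table_exists(cursor, table_name, dbname)
            and single_table_exists(cursor, f"{table_name}_inst", dbname))

cursor = MagicMock()
cursor.fetchone.side_effect = [(1,), (0,)]  # alarm exists, alarm_inst does not
print(paired_tables_exist(cursor, "alarm", "mydb"))  # False -> recreate both
```

Because the two tables are only valid as a pair, a single missing companion is treated the same as both tables missing.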

Parameters:

cursor : mysql.connector.cursor.MySQLCursor
    An active database cursor.

table_name : str
    Base name of the alarm table (e.g. 'alarm').

dbname : str
    Name of the database to search within.

Returns:

bool
    True if both table_name and {table_name}_inst exist; False otherwise.

create_new_table(cursor: MySQLCursor, table_name: str, table_column_names: list = None, table_column_types: list = None, primary_key: str = 'time_pt', has_primary_key: bool = True) → bool

Create both the alarm table and its _inst companion table.

Uses CREATE TABLE IF NOT EXISTS so the method is safe to call even when only one of the two tables is missing.
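The safety property comes from IF NOT EXISTS: re-running the DDL when one or both tables already exist is a no-op. A sketch of what the paired statements could look like (this column set is hypothetical, not the actual fixed alarm schema):

```python
def alarm_ddl(table_name):
    # Hypothetical schema for illustration; the real fixed schema may differ.
    base = (
        f"CREATE TABLE IF NOT EXISTS {table_name} ("
        "id INT AUTO_INCREMENT PRIMARY KEY, "
        "site_name VARCHAR(64), alarm_type VARCHAR(64), "
        "variable_name VARCHAR(64), start_date DATE, end_date DATE)"
    )
    inst = (
        f"CREATE TABLE IF NOT EXISTS {table_name}_inst ("
        "id INT AUTO_INCREMENT PRIMARY KEY, alarm_id INT, "
        "start_time_pt DATETIME, end_time_pt DATETIME, certainty INT)"
    )
    return base, inst

for stmt in alarm_ddl("alarm"):
    assert "IF NOT EXISTS" in stmt  # safe to re-run when a table exists
```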

Parameters:

cursor : mysql.connector.cursor.MySQLCursor
    An active database cursor.

table_name : str
    Base name for the alarm table. The instance table is derived as {table_name}_inst.

table_column_names : list, optional
    Ignored; the alarm schema is fixed. Retained for interface compatibility with the parent class.

table_column_types : list, optional
    Ignored; the alarm schema is fixed. Retained for interface compatibility with the parent class.

primary_key : str, optional
    Ignored; the alarm schema uses a fixed auto-increment id. Retained for interface compatibility with the parent class.

has_primary_key : bool, optional
    Ignored; the alarm schema always defines a primary key. Retained for interface compatibility with the parent class.

Returns:

bool
    Always True after executing the DDL statements.

load_database(config: ConfigManager, alarm_df: DataFrame, table_name: str, dbname: str, auto_log_data_loss: bool = False, primary_key: str = 'time_pt', site_name: str = None) → bool

Load alarm data into the alarm and alarm_inst tables.

For each alarm instance in the DataFrame the method:

  1. Checks whether a matching alarm record (same site_name, alarm_type, variable_name) already exists within a three-day gap tolerance.

  2. Creates a new alarm record if none is found, or extends the date range of the nearest existing record.

  3. Inserts alarm instances into {table_name}_inst using certainty-based overlap resolution:

    • Higher certainty new alarm: the existing instance is split around the new one so each segment retains the highest available certainty.

    • Lower certainty new alarm: only the non-overlapping portions of the new alarm are inserted.

    • Same certainty: the existing instance is extended to encompass both time periods.
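The three overlap rules above can be sketched in pure Python over (start, end, certainty) tuples; this is an illustrative model of the resolution logic, not the library's implementation:

```python
def resolve_overlap(existing, new):
    """Sketch of certainty-based overlap resolution.

    Each interval is (start, end, certainty); higher certainty wins
    wherever the two overlap. Assumes the intervals do overlap.
    """
    es, ee, ec = existing
    ns, ne, nc = new
    if nc > ec:
        # New alarm wins: split the existing instance around it.
        segments = [(ns, ne, nc)]
        if es < ns:
            segments.insert(0, (es, ns, ec))
        if ee > ne:
            segments.append((ne, ee, ec))
        return segments
    if nc < ec:
        # Existing wins: keep only the non-overlapping parts of the new alarm.
        segments = [(es, ee, ec)]
        if ns < es:
            segments.insert(0, (ns, es, nc))
        if ne > ee:
            segments.append((ee, ne, nc))
        return segments
    # Same certainty: merge into one instance spanning both.
    return [(min(es, ns), max(ee, ne), ec)]

print(resolve_overlap((0, 10, 2), (5, 15, 3)))
# -> [(0, 5, 2), (5, 15, 3)]
```

Each resulting segment retains the highest certainty available for its time range, which is the invariant the three bullet rules preserve.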

Parameters:

config : ConfigManager
    The ConfigManager object that holds configuration data for the pipeline.

alarm_df : pd.DataFrame
    DataFrame of alarms to load. Required columns: start_time_pt, end_time_pt, alarm_type, variable_name. Optional column: certainty (defaults to 3 if absent). Certainty scale: 3 = high, 2 = medium, 1 = low.

table_name : str
    Base name of the alarm table. The companion instance table is derived as {table_name}_inst.

dbname : str
    Name of the MySQL database.

auto_log_data_loss : bool, optional
    Unused in this subclass; retained for interface compatibility. Defaults to False.

primary_key : str, optional
    Unused in this subclass; retained for interface compatibility. Defaults to 'time_pt'.

site_name : str, optional
    Site name to associate alarms with. Defaults to config.get_site_name().

Returns:

bool
    True if all alarms were loaded successfully; False if an exception occurred (the transaction is rolled back).

Raises:

Exception
    If alarm_df is missing any of the required columns.

Exception
    If an alarm ID cannot be retrieved after insertion.

class ecopipeline.load.Loader

Bases: object

Base class for loading pandas DataFrames into a MySQL database.

Provides UPSERT-based loading, table creation, column management, and data-loss reporting utilities used by all concrete loader subclasses.

Attributes:

data_map : dict
    Mapping from pandas dtype name strings to MySQL column type strings.
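An illustrative mapping of this shape is shown below; the real data_map is defined on Loader and may differ in its exact entries:

```python
# Illustrative dtype-name-to-MySQL-type mapping (not the actual data_map).
data_map = {
    "float64": "FLOAT",
    "int64": "INT",
    "bool": "BOOLEAN",
    "datetime64[ns]": "DATETIME",
    "object": "VARCHAR(255)",
}

# Column dtypes as pandas would report them via str(df.dtypes[col]):
column_dtypes = {"power_kw": "float64", "online": "bool"}
mysql_types = [data_map[dt] for dt in column_dtypes.values()]
print(mysql_types)  # ['FLOAT', 'BOOLEAN']
```

A mapping like this lets the loader derive CREATE TABLE and ALTER TABLE column types directly from a DataFrame's dtypes.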

Methods

check_table_exists(cursor, table_name, dbname)

Check whether a table exists in the database.

create_new_table(cursor, table_name, ...[, ...])

Create a new table in the MySQL database.

report_data_loss(config[, site_name])

Log a DATA_LOSS_COP event in the site_events table.

check_table_exists(cursor: MySQLCursor, table_name: str, dbname: str) → int

Check whether a table exists in the database.

Parameters:

cursor : mysql.connector.cursor.MySQLCursor
    An active database cursor.

table_name : str
    Name of the table to check.

dbname : str
    Name of the database to search within.

Returns:

int
    The count of tables matching table_name in dbname. Evaluates to True when non-zero, so it can be used directly as a boolean.

create_new_columns(cursor: MySQLCursor, table_name: str, new_columns: list, data_types: str)

Add new columns to an existing database table.

Issues one ALTER TABLE ADD COLUMN statement per column. Stops and returns False on the first database error.

Parameters:

cursor : mysql.connector.cursor.MySQLCursor
    An active database cursor.

table_name : str
    Name of the table to alter.

new_columns : list
    Ordered list of column names to add.

data_types : list
    Ordered list of MySQL type strings corresponding to new_columns.

Returns:

bool
    True if all columns were added successfully; False if a database error occurred.

create_new_table(cursor: MySQLCursor, table_name: str, table_column_names: list, table_column_types: list, primary_key: str = 'time_pt', has_primary_key: bool = True) → bool

Create a new table in the MySQL database.

Parameters:

cursor : mysql.connector.cursor.MySQLCursor
    An active database cursor.

table_name : str
    Name of the table to create.

table_column_names : list
    Ordered list of column names (excluding the primary-key column).

table_column_types : list
    Ordered list of MySQL type strings corresponding to table_column_names. Must be the same length as table_column_names.

primary_key : str, optional
    Name of the primary-key column. Defaults to 'time_pt'.

has_primary_key : bool, optional
    If False, the primary_key column is added as a plain column rather than a PRIMARY KEY. Defaults to True.

Returns:

bool
    True if the table was successfully created.

Raises:

Exception
    If table_column_names and table_column_types are different lengths.

find_missing_columns(cursor: MySQLCursor, dataframe: DataFrame, dbname: str, table_name: str)

Identify DataFrame columns that are absent from the database table.

If communication with the database fails, empty lists are returned so that the caller can continue without adding any columns.
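A sketch of this comparison, assuming the schema is read from information_schema and a dtype-to-type mapping is available (the function signature and query here are illustrative):

```python
from unittest.mock import MagicMock

def find_missing_columns(cursor, df_columns, df_dtypes, dbname, table_name, data_map):
    # Illustrative sketch: compare DataFrame columns against the table schema.
    try:
        cursor.execute(
            "SELECT column_name FROM information_schema.columns "
            "WHERE table_schema = %s AND table_name = %s",
            (dbname, table_name),
        )
        existing = {row[0] for row in cursor.fetchall()}
    except Exception:
        # On a communication failure, return empty lists so the caller
        # proceeds without adding any columns.
        return [], []
    missing = [c for c in df_columns if c not in existing]
    types = [data_map.get(df_dtypes[c], "VARCHAR(255)") for c in missing]
    return missing, types

cursor = MagicMock()
cursor.fetchall.return_value = [("time_pt",), ("power_kw",)]
cols, types = find_missing_columns(
    cursor, ["power_kw", "temp_c"], {"temp_c": "float64"},
    "mydb", "minute_data", {"float64": "FLOAT"},
)
print(cols, types)  # ['temp_c'] ['FLOAT']
```

The two returned lists line up index-for-index, which is the shape create_new_columns() expects.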

Parameters:

cursor : mysql.connector.cursor.MySQLCursor
    An active database cursor.

dataframe : pd.DataFrame
    The DataFrame whose columns are compared against the table schema.

dbname : str
    Name of the database that contains table_name.

table_name : str
    Name of the table to inspect.

Returns:

list
    Column names present in dataframe but absent from the table.

list
    Corresponding MySQL type strings for each missing column.

load_database(config: ConfigManager, dataframe: DataFrame, table_name: str, dbname: str, auto_log_data_loss: bool = False, primary_key: str = 'time_pt')

Load a pandas DataFrame into a MySQL table using an UPSERT strategy.

Existing rows are updated rather than replaced; NULL values in the incoming DataFrame will not overwrite existing non-NULL values in the database.
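One common way to get this behavior in MySQL is INSERT ... ON DUPLICATE KEY UPDATE with COALESCE, so an incoming NULL falls back to the stored value. This is a hedged sketch of what such a generated statement could look like, not the library's actual SQL:

```python
def build_upsert(table_name, primary_key, columns):
    # Sketch of an UPSERT where NULLs in the new row do not clobber
    # existing non-NULL values: COALESCE keeps the old value on NULL.
    cols = [primary_key] + columns
    placeholders = ", ".join(["%s"] * len(cols))
    updates = ", ".join(
        f"{c} = COALESCE(VALUES({c}), {c})" for c in columns
    )
    return (
        f"INSERT INTO {table_name} ({', '.join(cols)}) "
        f"VALUES ({placeholders}) "
        f"ON DUPLICATE KEY UPDATE {updates}"
    )

sql = build_upsert("minute_data", "time_pt", ["power_kw", "temp_c"])
print(sql)
```

With this pattern a re-run of the pipeline over an already-loaded window updates rows in place rather than duplicating or blanking them.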

Parameters:

config : ConfigManager
    The ConfigManager object that holds configuration data for the pipeline.

dataframe : pd.DataFrame
    The pandas DataFrame to be written into the MySQL server. The index must be the primary-key column (e.g. a datetime index named time_pt).

table_name : str
    Name of the destination table in the database.

dbname : str
    Name of the MySQL database that contains table_name.

auto_log_data_loss : bool, optional
    If True, a DATA_LOSS_COP event is recorded when no data exists in the DataFrame for the last three days, or when an exception occurs. Defaults to False.

primary_key : str, optional
    Column name used as the primary key in the destination table. Defaults to 'time_pt'.

Returns:

bool
    True if the data were successfully written; False otherwise.

report_data_loss(config: ConfigManager, site_name: str = None)

Log a DATA_LOSS_COP event in the site_events table.

Records that COP calculations have been affected by a data loss condition. If an open DATA_LOSS_COP event already exists for the given site, no duplicate is inserted.

Parameters:

config : ConfigManager
    The ConfigManager object that holds configuration data for the pipeline.

site_name : str, optional
    Name of the site to associate the event with. Defaults to the site name returned by config.get_site_name().

Returns:

bool
    True if the event was logged (or already existed); False if the site_events table does not exist.

ecopipeline.load.central_load_function(config: ConfigManager, df: DataFrame, hourly_df: DataFrame, daily_df: DataFrame, alarm_df: DataFrame)

Dispatch all pipeline DataFrames to their respective database tables.

Loads minute, hourly, and daily sensor data using Loader, and alarm data using AlarmLoader. Each DataFrame is only written when it is non-None and non-empty.
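The non-None, non-empty guard can be sketched as follows, with plain lists standing in for DataFrames and callables standing in for the loaders (names here are illustrative):

```python
def central_load_sketch(frames, loaders):
    # Illustrative dispatch: write each frame only when it is
    # non-None and non-empty.
    written = []
    for name, df in frames.items():
        if df is not None and len(df) > 0:
            loaders[name](df)
            written.append(name)
    return written

loaded = []
frames = {"minute": [1, 2], "hourly": [], "daily": None, "alarm": [3]}
loaders = {k: (lambda df, k=k: loaded.append(k)) for k in frames}
print(central_load_sketch(frames, loaders))  # ['minute', 'alarm']
```

Empty and absent frames are silently skipped, so a pipeline stage that produced no output does not fail the load step.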

Parameters:

config : ConfigManager
    The ConfigManager object that holds configuration data for the pipeline.

df : pd.DataFrame
    Minute-resolution sensor data. May be None.

hourly_df : pd.DataFrame
    Hourly-resolution sensor data. May be None.

daily_df : pd.DataFrame
    Daily-resolution sensor data. May be None.

alarm_df : pd.DataFrame
    Alarm records produced by the transform stage. May be None.

ecopipeline.load.check_table_exists(cursor: MySQLCursor, table_name: str, dbname: str) → int

Check whether a table exists in the database.

Parameters:

cursor : mysql.connector.cursor.MySQLCursor
    An active database cursor.

table_name : str
    Name of the table to check.

dbname : str
    Name of the database to search within.

Returns:

int
    The count of tables matching table_name in dbname. Evaluates to True when non-zero, so it can be used directly as a boolean.

ecopipeline.load.create_new_table(cursor: MySQLCursor, table_name: str, table_column_names: list, table_column_types: list, primary_key: str = 'time_pt', has_primary_key: bool = True) → bool

Create a new table in the MySQL database.

Parameters:

cursor : mysql.connector.cursor.MySQLCursor
    An active database cursor.

table_name : str
    Name of the table to create.

table_column_names : list
    Ordered list of column names (excluding the primary-key column).

table_column_types : list
    Ordered list of MySQL type strings corresponding to table_column_names. Must be the same length as table_column_names.

primary_key : str, optional
    Name of the primary-key column. Defaults to 'time_pt'.

has_primary_key : bool, optional
    If False, the primary_key column is added as a plain column rather than a PRIMARY KEY. Defaults to True.

Returns:

bool
    True if the table was successfully created.

Raises:

Exception
    If table_column_names and table_column_types are different lengths.

ecopipeline.load.load_alarms(config: ConfigManager, alarm_df: DataFrame, site_name: str = None) → bool

Load alarm data into the alarm and alarm_inst tables.

Processes the output of central_alarm_df_creator(). For each alarm instance in the DataFrame the function:

  1. Checks whether a matching alarm record (same site_name, alarm_type, variable_name) already exists within a three-day gap tolerance.

  2. Creates a new alarm record if none is found, or extends the date range of the nearest existing record.

  3. Inserts alarm instances into alarm_inst using certainty-based overlap resolution:

    • Higher certainty new alarm: the existing instance is split around the new one so each segment retains the highest available certainty.

    • Lower certainty new alarm: only the non-overlapping portions of the new alarm are inserted.

    • Same certainty: the existing instance is extended to encompass both time periods.
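A minimal alarm_df satisfying the documented column contract could be built like this (the row values are hypothetical, and the final call requires a live database and config, so it is shown commented out):

```python
from datetime import datetime
import pandas as pd

# Hypothetical alarm rows; column names follow the documented contract.
alarm_df = pd.DataFrame({
    "start_time_pt": [datetime(2024, 6, 1, 8, 0)],
    "end_time_pt":   [datetime(2024, 6, 1, 9, 30)],
    "alarm_type":    ["SENSOR_FLATLINE"],
    "variable_name": ["power_kw"],
    # 'certainty' omitted on purpose: it defaults to 3 (high).
})

required = {"start_time_pt", "end_time_pt", "alarm_type", "variable_name"}
assert required.issubset(alarm_df.columns)
# With a configured pipeline this frame would then be loaded via:
# load_alarms(config, alarm_df, site_name="site_a")
```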

Parameters:

config : ConfigManager
    The ConfigManager object that holds configuration data for the pipeline.

alarm_df : pd.DataFrame
    DataFrame output from central_alarm_df_creator(). Required columns: start_time_pt, end_time_pt, alarm_type, variable_name. Optional column: certainty (defaults to 3 if absent). Certainty scale: 3 = high, 2 = medium, 1 = low.

site_name : str, optional
    Name of the site to associate alarms with. Defaults to config.get_site_name().

Returns:

bool
    True if all alarms were loaded successfully; False if an exception occurred (the transaction is rolled back).

Raises:

Exception
    If alarm_df is missing any of the required columns.

Exception
    If an alarm ID cannot be retrieved after insertion.

ecopipeline.load.load_data_statistics(config: ConfigManager, daily_stats_df: DataFrame, config_daily_indicator: str = 'day', custom_table_name: str = None)

Write daily data-quality statistics to the database.

The destination table is named {daily_table_name}_stats unless custom_table_name is provided.
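The naming rule reduces to a one-liner; this sketch mirrors the documented behavior:

```python
def stats_table_name(daily_table_name, custom_table_name=None):
    # Destination is '<daily_table_name>_stats' unless overridden.
    return custom_table_name or f"{daily_table_name}_stats"

print(stats_table_name("daily_data"))           # daily_data_stats
print(stats_table_name("daily_data", "my_qc"))  # my_qc
```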

Parameters:

config : ConfigManager
    The ConfigManager object that holds configuration data for the pipeline.

daily_stats_df : pd.DataFrame
    DataFrame produced by create_data_statistics_df() in ecopipeline.transform.

config_daily_indicator : str, optional
    Key used to look up the daily table name in config.ini. Defaults to 'day'.

custom_table_name : str, optional
    Overrides the auto-generated {daily_table_name}_stats destination table name. When provided, config_daily_indicator is only used to supply config_info; it no longer determines the table name. Defaults to None.

Returns:

bool
    True if the data were successfully written; False otherwise.

ecopipeline.load.load_event_table(config: ConfigManager, event_df: DataFrame, site_name: str = None)

Load event records into the site_events MySQL table.

Uses an UPSERT strategy so that existing automatically-uploaded rows can be updated while manually modified rows are left unchanged. If the DataFrame contains an alarm_type column the call is transparently redirected to load_alarms().
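The routing and the required-column check can be sketched as follows; this models the documented front-door behavior rather than the actual implementation:

```python
REQUIRED = {"end_time_pt", "event_type", "event_detail"}

def route_event_df(columns):
    # Sketch of load_event_table's routing: alarm-shaped frames are
    # redirected to load_alarms(); others must carry the required columns.
    cols = set(columns)
    if "alarm_type" in cols:
        return "load_alarms"
    missing = REQUIRED - cols
    if missing:
        raise Exception(f"event_df missing required columns: {sorted(missing)}")
    return "site_events"

print(route_event_df(["end_time_pt", "event_type", "event_detail"]))  # site_events
print(route_event_df(["alarm_type", "start_time_pt"]))                # load_alarms
```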

Parameters:

config : ConfigManager
    The ConfigManager object that holds configuration data for the pipeline.

event_df : pd.DataFrame
    DataFrame of events to load. The index must be start_time_pt. Required columns: end_time_pt, event_type, event_detail. Optional column: variable_name.

site_name : str, optional
    Name of the site to associate events with. Defaults to config.get_site_name().

Returns:

bool
    True if the data were successfully written; False if the table could not be created.

Raises:

Exception
    If event_df is missing any of the required columns (end_time_pt, event_type, event_detail).

ecopipeline.load.load_overwrite_database(config: ConfigManager, dataframe: DataFrame, config_info: dict, data_type: str, primary_key: str = 'time_pt', table_name: str = None, auto_log_data_loss: bool = False, config_key: str = 'minute')

Load a pandas DataFrame into a MySQL table using an UPSERT strategy.

Existing rows are updated rather than replaced; NULL values in the incoming DataFrame will not overwrite existing non-NULL values in the database.

Parameters:

config : ConfigManager
    The ConfigManager object that holds configuration data for the pipeline.

dataframe : pd.DataFrame
    The DataFrame to be written into the MySQL server. The index must be the primary-key column (e.g. a datetime index named time_pt).

config_info : dict
    Configuration dictionary for the upload, obtainable via get_login_info(). Must contain a 'database' key and a nested dict keyed by data_type with a 'table_name' entry (used when table_name is None).

data_type : str
    Key within config_info that identifies the target table section.

primary_key : str, optional
    Column name used as the primary key. Defaults to 'time_pt'.

table_name : str, optional
    Overrides the table name derived from config_info[data_type]. Defaults to None.

auto_log_data_loss : bool, optional
    If True, a DATA_LOSS_COP event is recorded when no data exists in the DataFrame for the last three days, or when an exception occurs. Defaults to False.

config_key : str, optional
    Key in config.ini that points to the minute-table data for the site; also used as the site name when reporting data loss. Defaults to 'minute'.

Returns:

bool
    True if the data were successfully written; False otherwise.

ecopipeline.load.report_data_loss(config: ConfigManager, site_name: str = None)

Log a DATA_LOSS_COP event in the site_events table.

Records that COP calculations have been affected by a data loss condition. If an open DATA_LOSS_COP event already exists for the given site, no duplicate is inserted.

Parameters:

config : ConfigManager
    The ConfigManager object that holds configuration data for the pipeline.

site_name : str, optional
    Name of the site to associate the event with. Defaults to the site name returned by config.get_site_name().

Returns:

bool
    True if the event was logged (or already existed); False if the site_events table does not exist.