Pipeline Configuration Documentation

class utils.ConfigManager.ConfigManager(config_file_path: str = 'config.ini', input_directory: str = None, output_directory: str = None, data_directory: str = None, eco_file_structure: bool = False)

Manage configuration, directory paths, database credentials, and API tokens for a data pipeline.

Reads a config.ini file to populate connection and directory settings. Optionally remaps Windows-style drive letters to POSIX paths when running on Ecotope’s server infrastructure.

Parameters:
config_file_path : str, optional

Path to the config.ini file for the pipeline (e.g. "full/path/to/config.ini"). Must contain login information for the MySQL database where data is to be loaded. Defaults to "config.ini".

input_directory : str, optional

Path to the input directory for the pipeline (e.g. "full/path/to/pipeline/input/"). Defaults to the value defined in the [input] section of the config file.

output_directory : str, optional

Path to the output directory for the pipeline (e.g. "full/path/to/pipeline/output/"). Defaults to the value defined in the [output] section of the config file.

data_directory : str, optional

Path to the data directory for the pipeline (e.g. "full/path/to/pipeline/data/"). Defaults to the value defined in the [data] section of the config file.

eco_file_structure : bool, optional

Set to True when the pipeline runs on Ecotope’s server so that Windows drive-letter prefixes (R:, F:) are remapped to the correct POSIX mount points. Defaults to False.

Raises:
Exception

If config_file_path does not exist on the filesystem.

Exception

If the [input] section or its directory key is missing from the config file.

Exception

If the [output] section or its directory key is missing from the config file.

Exception

If the [data] section is missing or contains no recognised data source configuration.

Exception

If any of the resolved directory paths do not exist on the filesystem.
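The checks listed above can be sketched with the standard-library configparser module. This is a minimal illustration, not the class's actual implementation: the function name load_directories, the key name "directory", and the exception messages are assumptions, and the [data] section is left out because its rules differ from those of [input] and [output].

```python
import configparser
import os


def load_directories(config_file_path):
    """Return the [input] and [output] directories named in a config file."""
    if not os.path.exists(config_file_path):
        raise Exception(f"File path '{config_file_path}' does not exist.")

    # Interpolation is disabled so that a '%' in a path is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(config_file_path)

    directories = {}
    for section in ("input", "output"):
        # Assumption: each section stores its path under a "directory" key.
        if section not in parser or "directory" not in parser[section]:
            raise Exception(f"Missing [{section}] section or its directory key.")
        path = parser[section]["directory"]
        if not os.path.isdir(path):
            raise Exception(f"Directory '{path}' does not exist.")
        directories[section] = path
    return directories
```

Under these assumptions, a config file with [input] and [output] sections that point at existing directories yields a dict with the keys "input" and "output"; any failed check raises a plain Exception, as in the list above.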

Attributes:
config_directory : str

Resolved path to the config.ini file.

input_directory : str

Resolved path to the pipeline input directory.

output_directory : str

Resolved path to the pipeline output directory.

data_directory : str

Resolved path to the pipeline data directory.

api_usr : str or None

API username read from the config file, if present.

api_pw : str or None

API password read from the config file, if present.

api_token : str or None

API token read from the config file, if present.

api_secret : str or None

API secret read from the config file, if present.

api_device_id : str or None

API device ID read from the config file, if present.

db_connection_info : dict

Dictionary containing user, password, host, and database keys used to open MySQL connections.

Methods

connect_db()

Create a connection to the configured MySQL database.

connect_siteConfig_db()

Create a connection to the SiteConfig MySQL database.

get_db_name()

Return the name of the database that data will be uploaded to.

get_db_table_info(table_headers)

Read table configuration from the config file and return a combined info dict.

get_event_log_path()

Return the full path to the Event_Log.csv file.

get_fm_device_id()

Return the configured API device ID.

get_fm_token()

Retrieve a Field Manager API token using the configured credentials.

get_ls_df([ls_file_name])

Load the load-shift schedule CSV and return it as a DataFrame.

get_ls_filename([ls_file_name])

Return the full path to the load-shift CSV file if it exists.

get_site_name([config_key])

Return the site name derived from the configured minute-table name.

get_skycentrics_token([request_str, date_str])

Generate a Skycentrics HMAC-SHA1 authentication token.

get_table_name(header)

Return the table_name value for the given config file section.

get_thingsboard_token()

Retrieve a ThingsBoard API JWT token using the configured credentials.

get_var_names_path()

Return the full path to the Variable_Names.csv file.

get_weather_dir_path()

Return the path to the directory that holds NOAA weather data files.

connect_db() -> (<class 'mysql.connector.connection.MySQLConnection'>, <class 'mysql.connector.cursor.MySQLCursor'>)

Create a connection to the configured MySQL database.

Uses the host, user, password, and database name stored in db_connection_info. Prints a message and returns (None, None) if the connection attempt fails.

Returns:
tuple

A 2-tuple of (mysql.connector.MySQLConnection, mysql.connector.cursor.MySQLCursor). The cursor can be used to execute MySQL queries and the connection object can be used to commit those changes. Both elements are None if the connection could not be established.
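Because a failed connection is reported as (None, None) rather than as an exception, callers need to check the result before using it. The sketch below shows one way to do that. The helper name count_rows and its query are hypothetical; config_manager stands for any object with a connect_db() method that behaves as documented here.

```python
def count_rows(config_manager, table_name):
    """Return the row count of table_name, or None if the connection failed."""
    connection, cursor = config_manager.connect_db()
    if connection is None or cursor is None:
        # connect_db() has already printed a message; nothing to clean up.
        return None
    try:
        # Table names cannot be bound as query parameters, so table_name
        # must come from a trusted source such as the config file.
        cursor.execute(f"SELECT COUNT(*) FROM `{table_name}`")
        (row_count,) = cursor.fetchone()
        return row_count
    finally:
        # Release the cursor and the connection even if the query raised.
        cursor.close()
        connection.close()
```

The same guard applies to connect_siteConfig_db(), which reports failure the same way.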

connect_siteConfig_db() -> (<class 'mysql.connector.connection.MySQLConnection'>, <class 'mysql.connector.cursor.MySQLCursor'>)

Create a connection to the SiteConfig MySQL database.

Uses the same host, user, and password stored in db_connection_info but always connects to the SiteConfig database regardless of the database name in the config file. Prints a message and returns (None, None) if the connection attempt fails.

Returns:
tuple

A 2-tuple of (mysql.connector.MySQLConnection, mysql.connector.cursor.MySQLCursor). The cursor can be used to execute MySQL queries and the connection object can be used to commit those changes. Both elements are None if the connection could not be established.

get_db_name() -> str

Return the name of the database that data will be uploaded to.

Returns:
str

The database name from the stored connection info.

get_db_table_info(table_headers: list) -> dict

Read table configuration from the config file and return a combined info dict.

For each header in table_headers, the corresponding table_name value is read from the matching section of config.ini. The name of the configured database is also included in the result under the key "database".

Parameters:
table_headers : list

Section headers from config.ini whose table_name values should be retrieved. Each entry must exactly match a section name in the config file.

Returns:
dict

A dictionary mapping each header to a nested dict with a "table_name" key, plus a top-level "database" key containing the database name from the stored connection info.
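The shape of the returned dictionary can be illustrated with configparser. This is a sketch under stated assumptions: build_table_info is a hypothetical stand-in for the method, and the section names, table names, and database name are invented for the example.

```python
import configparser


def build_table_info(parser, table_headers, database_name):
    """Map each header to its table_name and add the database name."""
    info = {
        header: {"table_name": parser.get(header, "table_name")}
        for header in table_headers
    }
    info["database"] = database_name
    return info


# Hypothetical config contents, used only for this illustration.
parser = configparser.ConfigParser()
parser.read_string("""
[minute]
table_name = example_site_minute

[daily]
table_name = example_site_daily
""")

table_info = build_table_info(parser, ["minute", "daily"], "example_db")
```

Here table_info holds one nested dict per header plus the top-level "database" entry. A header with no matching section makes parser.get raise configparser.NoSectionError, which is why each entry must match a section name exactly.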

get_event_log_path() -> str

Return the full path to the Event_Log.csv file.

The file is expected to reside directly inside the pipeline’s input directory (e.g. "full/path/to/pipeline/input/Event_Log.csv").

Returns:
str

Absolute path to Event_Log.csv.

get_fm_device_id() -> str

Return the configured API device ID.

Returns:
str

The device ID string from the configuration file.

Raises:
Exception

If device_id (or fieldManager_device_id) was not provided in the configuration file.

get_fm_token() -> str

Retrieve a Field Manager API token using the configured credentials.

Sends a GET request to the FieldPop login endpoint with the stored api_usr and api_pw credentials. Prints a message and returns None if the HTTP request fails or an exception is raised.

Returns:
str or None

The Field Manager API token string on success, or None if the token could not be retrieved.

Raises:
Exception

If api_usr or api_pw were not provided in the configuration file.

get_ls_df(ls_file_name: str = 'load_shift.csv') -> DataFrame

Load the load-shift schedule CSV and return it as a DataFrame.

Reads the CSV file from the input directory, parses the date and startTime columns into a startDateTime column, and the date and endTime columns into an endDateTime column. If the file does not exist, a warning is printed and an empty DataFrame is returned.

Parameters:
ls_file_name : str, optional

Name of the load-shift CSV file located in the pipeline’s input directory. Defaults to 'load_shift.csv'.

Returns:
pd.DataFrame

DataFrame containing the load-shift schedule with additional startDateTime and endDateTime columns, or an empty DataFrame if the file does not exist.
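The column handling described above can be sketched without pandas. This standard-library stand-in returns a list of dicts instead of a DataFrame. The name read_load_shift, the date format "%Y-%m-%d", and the time format "%H:%M" are assumptions; the source does not state how the columns are formatted.

```python
import csv
import os
from datetime import datetime

# Assumed layout of the combined "date time" text; not stated in the source.
DATETIME_FORMAT = "%Y-%m-%d %H:%M"


def read_load_shift(input_directory, ls_file_name="load_shift.csv"):
    """Read the schedule and add startDateTime and endDateTime to each row."""
    path = os.path.join(input_directory, ls_file_name)
    if not os.path.exists(path):
        print(f"Warning: the load-shift file {path} does not exist.")
        return []  # stands in for the empty DataFrame

    rows = []
    with open(path, newline="") as handle:
        for row in csv.DictReader(handle):
            # Combine the shared date column with each time column.
            row["startDateTime"] = datetime.strptime(
                f"{row['date']} {row['startTime']}", DATETIME_FORMAT
            )
            row["endDateTime"] = datetime.strptime(
                f"{row['date']} {row['endTime']}", DATETIME_FORMAT
            )
            rows.append(row)
    return rows
```

Both datetime columns are built from the same date column, so in this sketch a load-shift event cannot span midnight.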

get_ls_filename(ls_file_name: str = 'load_shift.csv') -> str

Return the full path to the load-shift CSV file if it exists.

Constructs the full path by joining the input directory with ls_file_name. Returns an empty string if the file does not exist or if ls_file_name is an empty string.

Parameters:
ls_file_name : str, optional

Name of the load-shift CSV file located in the pipeline’s input directory. Defaults to 'load_shift.csv'.

Returns:
str

Full path to the load-shift CSV file, or an empty string if the file does not exist.
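The lookup described above amounts to a path join followed by an existence check. A minimal sketch, in which load_shift_filename is a hypothetical free function standing in for the method:

```python
import os


def load_shift_filename(input_directory, ls_file_name="load_shift.csv"):
    """Return the full path to the load-shift file, or "" if unavailable."""
    if ls_file_name == "":
        return ""
    path = os.path.join(input_directory, ls_file_name)
    return path if os.path.exists(path) else ""
```

Since an empty string is falsy, a caller can write `if filename:` to decide whether a load-shift schedule is available.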

get_site_name(config_key: str = 'minute') -> str

Return the site name derived from the configured minute-table name.

The site name is read as the table_name value from the section identified by config_key in config.ini.

Parameters:
config_key : str, optional

Section header in config.ini that points to the minute-level table for the site. The table_name value of this section is used as the site name. Defaults to "minute".

Returns:
str

The site name (i.e. the table_name value for the given section).

get_skycentrics_token(request_str: str = 'GET /api/devices/ HTTP/1.', date_str: str = None) -> tuple

Generate a Skycentrics HMAC-SHA1 authentication token.

Constructs a signed token by combining the configured api_token and a base64-encoded HMAC-SHA1 signature derived from api_secret, the request string, the date string, and an MD5 hash of an empty body.

Parameters:
request_str : str, optional

The HTTP request line used as part of the signature input (e.g. 'GET /api/devices/ HTTP/1.'). Defaults to 'GET /api/devices/ HTTP/1.'.

date_str : str, optional

The date string to include in the signature, formatted as '%a, %d %b %H:%M:%S GMT'. Defaults to the current UTC time formatted in that style.

Returns:
tuple

A 2-tuple of (token, date_str) where token is the "<api_token>:<signature>" string and date_str is the date string that was used (either the supplied value or the generated one).
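The general HMAC-SHA1 signing pattern can be sketched with the standard library. The order and separators of the signed fields are an assumption here (the source lists the inputs but not their layout), and make_token is a hypothetical stand-in for the method, so tokens produced by this sketch are not guaranteed to be accepted by the Skycentrics API.

```python
import base64
import hashlib
import hmac
from datetime import datetime, timezone


def make_token(api_token, api_secret,
               request_str="GET /api/devices/ HTTP/1.", date_str=None):
    """Return (token, date_str), where token is "<api_token>:<signature>"."""
    if date_str is None:
        # Current UTC time in the format given in the parameter description.
        date_str = datetime.now(timezone.utc).strftime("%a, %d %b %H:%M:%S GMT")

    # MD5 hash of an empty request body.
    body_md5 = hashlib.md5(b"").hexdigest()

    # Assumption: the signed message is the three fields joined by newlines.
    message = "\n".join([request_str, date_str, body_md5])

    digest = hmac.new(
        api_secret.encode("utf-8"), message.encode("utf-8"), hashlib.sha1
    ).digest()
    signature = base64.b64encode(digest).decode("ascii")
    return f"{api_token}:{signature}", date_str
```

The date string is returned alongside the token because the signature depends on it, so the request has to send the same value that was signed.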

get_table_name(header: str) -> str

Return the table_name value for the given config file section.

Parameters:
header : str

Section header in config.ini whose table_name value should be retrieved.

Returns:
str

The table_name value found under the specified section.

get_thingsboard_token() -> str

Retrieve a ThingsBoard API JWT token using the configured credentials.

Sends a POST request to the ThingsBoard Cloud login endpoint with the stored api_usr and api_pw credentials. Prints a message and returns None if the HTTP request fails or an exception is raised.

Returns:
str or None

The ThingsBoard JWT token string on success, or None if the token could not be retrieved.

Raises:
Exception

If api_usr or api_pw were not provided in the configuration file.
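The login exchange can be sketched with the standard library's urllib in place of whichever HTTP client the method uses. The endpoint URL, the "username"/"password" payload fields, and the "token" response field follow the public ThingsBoard REST API as the editor understands it and are assumptions, not details taken from this class. build_login_request and extract_token are hypothetical helpers, and the sketch builds the request without sending it.

```python
import json
import urllib.request

# Assumed ThingsBoard Cloud login endpoint; not confirmed by the source.
LOGIN_URL = "https://thingsboard.cloud/api/auth/login"


def build_login_request(api_usr, api_pw, url=LOGIN_URL):
    """Build (but do not send) the POST request for the login endpoint."""
    if api_usr is None or api_pw is None:
        raise Exception("api_usr and api_pw must be set in the configuration file.")
    payload = json.dumps({"username": api_usr, "password": api_pw}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_token(response_body):
    """Return the JWT from a login response body, or None if it is absent."""
    try:
        return json.loads(response_body).get("token")
    except (ValueError, AttributeError):
        # Not valid JSON, or valid JSON that is not an object.
        return None
```

Passing the request to urllib.request.urlopen inside a try/except block would mirror the documented behavior of returning None when the HTTP request fails.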

get_var_names_path() -> str

Return the full path to the Variable_Names.csv file.

The file is expected to reside directly inside the pipeline’s input directory (e.g. "full/path/to/pipeline/input/Variable_Names.csv").

Returns:
str

Absolute path to Variable_Names.csv.

get_weather_dir_path() -> str

Return the path to the directory that holds NOAA weather data files.

The directory is expected to reside directly inside the pipeline’s data directory (e.g. "full/path/to/pipeline/data/weather").

Returns:
str

Path to the weather subdirectory within the data directory.