Pipeline Configuration Documentation
- class utils.ConfigManager.ConfigManager(config_file_path: str = 'config.ini', input_directory: str = None, output_directory: str = None, data_directory: str = None, eco_file_structure: bool = False)
Manage configuration, directory paths, database credentials, and API tokens for a data pipeline.
Reads a config.ini file to populate connection and directory settings. Optionally remaps Windows-style drive letters to POSIX paths when running on Ecotope’s server infrastructure.
- Parameters:
- config_file_path : str, optional
Path to the config.ini file for the pipeline (e.g. "full/path/to/config.ini"). Must contain login information for the MySQL database where data is to be loaded. Defaults to "config.ini".
- input_directory : str, optional
Path to the input directory for the pipeline (e.g. "full/path/to/pipeline/input/"). Defaults to the value defined in the [input] section of the config file.
- output_directory : str, optional
Path to the output directory for the pipeline (e.g. "full/path/to/pipeline/output/"). Defaults to the value defined in the [output] section of the config file.
- data_directory : str, optional
Path to the data directory for the pipeline (e.g. "full/path/to/pipeline/data/"). Defaults to the value defined in the [data] section of the config file.
- eco_file_structure : bool, optional
Set to True when the pipeline runs on Ecotope’s server so that Windows drive-letter prefixes (R:, F:) are remapped to the correct POSIX mount points. Defaults to False.
- Raises:
- Exception
If config_file_path does not exist on the filesystem.
- Exception
If the [input] section or its directory key is missing from the config file.
- Exception
If the [output] section or its directory key is missing from the config file.
- Exception
If the [data] section is missing or contains no recognised data source configuration.
- Exception
If any of the resolved directory paths do not exist on the filesystem.
- Attributes:
- config_directory : str
Resolved path to the config.ini file.
- input_directory : str
Resolved path to the pipeline input directory.
- output_directory : str
Resolved path to the pipeline output directory.
- data_directory : str
Resolved path to the pipeline data directory.
- api_usr : str or None
API username read from the config file, if present.
- api_pw : str or None
API password read from the config file, if present.
- api_token : str or None
API token read from the config file, if present.
- api_secret : str or None
API secret read from the config file, if present.
- api_device_id : str or None
API device ID read from the config file, if present.
- db_connection_info : dict
Dictionary containing user, password, host, and database keys used to open MySQL connections.
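As a rough illustration of the kind of file this class reads, the sketch below parses a minimal config.ini with Python's standard configparser module. Only the [input], [output], and [data] sections, the directory key, and the table_name key of a [minute] section come from the descriptions in this reference. The [database] section name, its key names, the directory key under [data], and every value shown are assumptions made for this example.

```python
import configparser

# Hypothetical config.ini contents. The [database] section name and its keys
# are assumptions for illustration; the documentation only names the [input],
# [output], and [data] sections and the "directory" / "table_name" keys.
EXAMPLE_CONFIG = """
[database]
user = pipeline_user
password = change_me
host = localhost
database = example_db

[minute]
table_name = example_site

[input]
directory = full/path/to/pipeline/input/

[output]
directory = full/path/to/pipeline/output/

[data]
directory = full/path/to/pipeline/data/
"""

parser = configparser.ConfigParser()
parser.read_string(EXAMPLE_CONFIG)

# Collect the four keys that db_connection_info is documented to hold.
db_connection_info = {
    key: parser["database"][key]
    for key in ("user", "password", "host", "database")
}

input_directory = parser["input"]["directory"]
site_table = parser["minute"]["table_name"]
```

Passing input_directory, output_directory, or data_directory to the constructor would take the place of the corresponding directory value read here.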
Methods
- connect_db(): Create a connection to the configured MySQL database.
- connect_siteConfig_db(): Create a connection to the SiteConfig MySQL database.
- get_db_name(): Return the name of the database that data will be uploaded to.
- get_db_table_info(table_headers): Read table configuration from the config file and return a combined info dict.
- get_event_log_path(): Return the full path to the Event_Log.csv file.
- get_fm_device_id(): Return the configured API device ID.
- get_fm_token(): Retrieve a Field Manager API token using the configured credentials.
- get_ls_df([ls_file_name]): Load the load-shift schedule CSV and return it as a DataFrame.
- get_ls_filename([ls_file_name]): Return the full path to the load-shift CSV file if it exists.
- get_site_name([config_key]): Return the site name derived from the configured minute-table name.
- get_skycentrics_token([request_str, date_str]): Generate a Skycentrics HMAC-SHA1 authentication token.
- get_table_name(header): Return the table_name value for the given config file section.
- get_thingsboard_token(): Retrieve a ThingsBoard API JWT token using the configured credentials.
- get_var_names_path(): Return the full path to the Variable_Names.csv file.
- get_weather_dir_path(): Return the path to the directory that holds NOAA weather data files.
- connect_db() -> (mysql.connector.connection.MySQLConnection, mysql.connector.cursor.MySQLCursor)
Create a connection to the configured MySQL database.
Uses the host, user, password, and database name stored in db_connection_info. Prints a message and returns (None, None) if the connection attempt fails.
- Returns:
- tuple
A 2-tuple of (mysql.connector.MySQLConnection, mysql.connector.cursor.MySQLCursor). The cursor can be used to execute MySQL queries and the connection object can be used to commit those changes. Both elements are None if the connection could not be established.
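Because a failed connection is reported as (None, None) rather than raised, callers need to check the result before using it. A minimal sketch of that check follows. The names fetch_rows, sqlite_stand_in, and failed_connection are hypothetical, and an in-memory sqlite3 database stands in for MySQL so that the snippet runs without a server; with a real ConfigManager instance, the connect argument would presumably be its connect_db method.

```python
import sqlite3

def fetch_rows(connect, query):
    """Run a query via a connect_db-style callable; return None on failure."""
    connection, cursor = connect()
    if connection is None or cursor is None:
        # The connection attempt failed, so there is nothing to clean up.
        return None
    try:
        cursor.execute(query)
        return cursor.fetchall()
    finally:
        # Always release the cursor and connection, even if the query fails.
        cursor.close()
        connection.close()

def sqlite_stand_in():
    """Hypothetical stand-in that mimics the (connection, cursor) return shape."""
    connection = sqlite3.connect(":memory:")
    return connection, connection.cursor()

def failed_connection():
    """Mimic the documented failure result."""
    return None, None

rows = fetch_rows(sqlite_stand_in, "SELECT 1")
missing = fetch_rows(failed_connection, "SELECT 1")
```

Closing both objects in a finally block keeps connections from leaking when a query raises.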
- connect_siteConfig_db() -> (mysql.connector.connection.MySQLConnection, mysql.connector.cursor.MySQLCursor)
Create a connection to the SiteConfig MySQL database.
Uses the same host, user, and password stored in db_connection_info but always connects to the SiteConfig database regardless of the database name in the config file. Prints a message and returns (None, None) if the connection attempt fails.
- Returns:
- tuple
A 2-tuple of (mysql.connector.MySQLConnection, mysql.connector.cursor.MySQLCursor). The cursor can be used to execute MySQL queries and the connection object can be used to commit those changes. Both elements are None if the connection could not be established.
- get_db_name() -> str
Return the name of the database that data will be uploaded to.
- Returns:
- str
The database name from the stored connection info.
- get_db_table_info(table_headers: list) -> dict
Read table configuration from the config file and return a combined info dict.
For each header in table_headers, the corresponding table_name value is read from the matching section of config.ini. The name of the configured database is also included in the result under the key "database".
- Parameters:
- table_headers : list
Section headers from config.ini whose table_name values should be retrieved. Each entry must exactly match a section name in the config file.
- Returns:
- dict
A dictionary mapping each header to a nested dict with a "table_name" key, plus a top-level "database" key containing the database name from the stored connection info.
- get_event_log_path() -> str
Return the full path to the Event_Log.csv file.
The file is expected to reside directly inside the pipeline’s input directory (e.g. "full/path/to/pipeline/input/Event_Log.csv").
- Returns:
- str
Absolute path to Event_Log.csv.
- get_fm_device_id() -> str
Return the configured API device ID.
- Returns:
- str
The device ID string from the configuration file.
- Raises:
- Exception
If device_id (or fieldManager_device_id) was not provided in the configuration file.
- get_fm_token() -> str
Retrieve a Field Manager API token using the configured credentials.
Sends a GET request to the FieldPop login endpoint with the stored api_usr and api_pw credentials. Prints a message and returns None if the HTTP request fails or an exception is raised.
- Returns:
- str or None
The Field Manager API token string on success, or None if the token could not be retrieved.
- Raises:
- Exception
If api_usr or api_pw was not provided in the configuration file.
- get_ls_df(ls_file_name: str = 'load_shift.csv') -> DataFrame
Load the load-shift schedule CSV and return it as a DataFrame.
Reads the CSV file from the input directory, parses the date and startTime columns into a startDateTime column, and the date and endTime columns into an endDateTime column. If the file does not exist, a warning is printed and an empty DataFrame is returned.
- Parameters:
- ls_file_name : str, optional
Name of the load-shift CSV file located in the pipeline’s input directory. Defaults to 'load_shift.csv'.
- Returns:
- pd.DataFrame
DataFrame containing the load-shift schedule with additional startDateTime and endDateTime columns, or an empty DataFrame if the file does not exist.
- get_ls_filename(ls_file_name: str = 'load_shift.csv') -> str
Return the full path to the load-shift CSV file if it exists.
Constructs the full path by joining the input directory with ls_file_name. Returns an empty string if the file does not exist or if ls_file_name is an empty string.
- Parameters:
- ls_file_name : str, optional
Name of the load-shift CSV file located in the pipeline’s input directory. Defaults to 'load_shift.csv'.
- Returns:
- str
Full path to the load-shift CSV file, or an empty string if the file does not exist.
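The lookup rule described for get_ls_filename can be sketched with the standard library alone. The function name find_load_shift_file is hypothetical and the code only mirrors the documented behavior (join, then return an empty string when the name is empty or the file is absent); it is not taken from the library's source.

```python
import os
import tempfile

def find_load_shift_file(input_directory, ls_file_name="load_shift.csv"):
    """Return the joined path if the file exists, otherwise an empty string."""
    if ls_file_name == "":
        return ""
    full_path = os.path.join(input_directory, ls_file_name)
    return full_path if os.path.exists(full_path) else ""

# Exercise the three documented cases in a throwaway input directory.
with tempfile.TemporaryDirectory() as input_dir:
    before = find_load_shift_file(input_dir)       # file not created yet
    expected = os.path.join(input_dir, "load_shift.csv")
    open(expected, "w").close()                    # create an empty file
    after = find_load_shift_file(input_dir)        # file now exists
    empty_name = find_load_shift_file(input_dir, "")
```

Returning an empty string instead of raising lets a caller treat a missing schedule as "no load shifting configured" with a simple truthiness check.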
- get_site_name(config_key: str = 'minute') -> str
Return the site name derived from the configured minute-table name.
The site name is read as the table_name value from the section identified by config_key in config.ini.
- Parameters:
- config_key : str, optional
Section header in config.ini that points to the minute-level table for the site. The table_name value of this section is used as the site name. Defaults to "minute".
- Returns:
- str
The site name (i.e. the table_name value for the given section).
- get_skycentrics_token(request_str: str = 'GET /api/devices/ HTTP/1.', date_str: str = None) -> tuple
Generate a Skycentrics HMAC-SHA1 authentication token.
Constructs a signed token by combining the configured api_token and a base64-encoded HMAC-SHA1 signature derived from api_secret, the request string, the date string, and an MD5 hash of an empty body.
- Parameters:
- request_str : str, optional
The HTTP request line used as part of the signature input (e.g. 'GET /api/devices/ HTTP/1.'). Defaults to 'GET /api/devices/ HTTP/1.'.
- date_str : str, optional
The date string to include in the signature, formatted as '%a, %d %b %H:%M:%S GMT'. Defaults to the current UTC time formatted in that style.
- Returns:
- tuple
A 2-tuple of (token, date_str) where token is the "<api_token>:<signature>" string and date_str is the date string that was used (either the supplied value or the generated one).
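The signing scheme described above can be approximated with Python's hmac, hashlib, and base64 modules. This is a sketch under stated assumptions: the function name make_token and the credential values are hypothetical, and the way the request string, date string, and MD5 hash are joined into one message (newline-separated here) is a guess, since this reference does not specify the layout. A token built this way is therefore not guaranteed to be accepted by the Skycentrics API.

```python
import base64
import hashlib
import hmac
from datetime import datetime, timezone

def make_token(api_token, api_secret,
               request_str="GET /api/devices/ HTTP/1.", date_str=None):
    """Build "<api_token>:<signature>" and return it with the date string used."""
    if date_str is None:
        # Current UTC time in the documented format.
        date_str = datetime.now(timezone.utc).strftime("%a, %d %b %H:%M:%S GMT")
    empty_body_md5 = hashlib.md5(b"").hexdigest()
    # Assumption: the signed message is the three parts joined by newlines.
    message = "\n".join([request_str, date_str, empty_body_md5])
    digest = hmac.new(api_secret.encode(), message.encode(), hashlib.sha1).digest()
    signature = base64.b64encode(digest).decode()
    return f"{api_token}:{signature}", date_str

# Hypothetical credentials and a fixed date so the result is reproducible.
token, used_date = make_token("example-token", "example-secret",
                              date_str="Mon, 01 Jan 00:00:00 GMT")
```

Returning the date string alongside the token matters because the same string must accompany the request for the signature to be checked against it.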
- get_table_name(header: str) -> str
Return the table_name value for the given config file section.
- Parameters:
- header : str
Section header in config.ini whose table_name value should be retrieved.
- Returns:
- str
The table_name value found under the specified section.
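A table_name lookup of this kind, and the combined dict that get_db_table_info is described as returning, can be sketched with the standard configparser module. The helper names read_table_name and build_table_info, the [hour] section, and all values are hypothetical; the code reproduces only the documented result shape, not the library's implementation.

```python
import configparser

def read_table_name(parser, header):
    """Return the table_name value stored under the given section header."""
    return parser[header]["table_name"]

def build_table_info(parser, table_headers, database_name):
    """Assemble a dict shaped like the documented get_db_table_info result."""
    info = {header: {"table_name": read_table_name(parser, header)}
            for header in table_headers}
    info["database"] = database_name
    return info

parser = configparser.ConfigParser()
parser.read_string("""
[minute]
table_name = example_site

[hour]
table_name = example_site_hour
""")

site_name = read_table_name(parser, "minute")
table_info = build_table_info(parser, ["minute", "hour"], "example_db")
```

In this sketch a header that does not match a section raises KeyError, which is consistent with the requirement that each entry exactly match a section name.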
- get_thingsboard_token() -> str
Retrieve a ThingsBoard API JWT token using the configured credentials.
Sends a POST request to the ThingsBoard Cloud login endpoint with the stored api_usr and api_pw credentials. Prints a message and returns None if the HTTP request fails or an exception is raised.
- Returns:
- str or None
The ThingsBoard JWT token string on success, or None if the token could not be retrieved.
- Raises:
- Exception
If api_usr or api_pw was not provided in the configuration file.
- get_var_names_path() -> str
Return the full path to the Variable_Names.csv file.
The file is expected to reside directly inside the pipeline’s input directory (e.g. "full/path/to/pipeline/input/Variable_Names.csv").
- Returns:
- str
Absolute path to Variable_Names.csv.
- get_weather_dir_path() -> str
Return the path to the directory that holds NOAA weather data files.
The directory is expected to reside directly inside the pipeline’s data directory (e.g. "full/path/to/pipeline/data/weather").
- Returns:
- str
Path to the weather subdirectory within the data directory.