Configuration
The fisseq_data_pipeline.config module provides a Config object for
managing pipeline configuration. It allows loading configuration values from
YAML files, Python dictionaries, or other Config objects, and ensures that
all values are validated against a default configuration.
Overview
- Config: A wrapper around a validated configuration dictionary.
  - Loads from a path, dictionary, Config, or falls back to the default config.yaml.
  - Allows access via both attribute-style (cfg.feature_cols) and dictionary-style (cfg["feature_cols"]).
  - Automatically fills in missing keys from the default configuration and removes invalid keys.
- DEFAULT_CFG_PATH: The path to the default configuration YAML file that ships with the pipeline.
Example usage
from fisseq_data_pipeline.config import Config
# Load default configuration
cfg = Config(None)
# Load from a YAML file
cfg = Config("my_config.yaml")
# Load from a Python dict
cfg = Config({"feature_cols": ["f1", "f2"], "batch_col_name": "batch"})
# Load from an existing Config
cfg2 = Config(cfg)
# Access values
print(cfg.feature_cols)
print(cfg["batch_col_name"])
Validation Behavior
When initializing a Config:
- Invalid keys not present in the default configuration are removed with a warning.
- Missing keys are filled with the default values from config.yaml.
This ensures that the configuration is always complete and consistent with the pipeline defaults.
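The two validation rules can be illustrated with a minimal sketch. This is not the pipeline's implementation: the `validate` helper and the `DEFAULTS` values are hypothetical, and it assumes the warning is raised through Python's standard `warnings` module, which the actual module may or may not use.

```python
import warnings

# Hypothetical defaults for illustration only; the real defaults live in
# the configuration file that ships with the pipeline.
DEFAULTS = {"feature_cols": [], "batch_col_name": "batch"}


def validate(user_cfg, defaults):
    """Return a new dict limited to keys in defaults and completed from them."""
    validated = {}
    for key, value in user_cfg.items():
        if key in defaults:
            validated[key] = value
        else:
            # Keys unknown to the defaults are dropped, with a warning.
            warnings.warn(f"Removing invalid config key: {key!r}")
    for key, value in defaults.items():
        # Keys the user did not supply take the default value.
        validated.setdefault(key, value)
    return validated


# Capture warnings so the example can inspect them instead of printing them.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = validate({"feature_cols": ["f1", "f2"], "unknown_key": 1}, DEFAULTS)
```

In this sketch, `result` keeps the user's `feature_cols`, gains `batch_col_name` from the defaults, and loses `unknown_key`, with one warning recorded in `caught`.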
API Reference
fisseq_data_pipeline.config.Config
A configuration object that wraps a dictionary of key-value pairs
loaded from a provided path, dictionary, or another Config instance.
If no configuration is provided, the default configuration file is used.
Source code in src/fisseq_data_pipeline/config.py
__getattr__(name)
Retrieve a configuration value as an attribute.
Source code in src/fisseq_data_pipeline/config.py
__getitem__(key)
Retrieve a configuration value using dictionary-style indexing.
Source code in src/fisseq_data_pipeline/config.py
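As a rough illustration of how attribute-style and dictionary-style access can share one underlying dictionary, the sketch below uses a hypothetical `ConfigSketch` class. It is an assumption about the general pattern, not the source of `fisseq_data_pipeline.config.Config`; the `_data` attribute name and the error handling are invented for the example.

```python
class ConfigSketch:
    """Hypothetical stand-in; not the pipeline's Config class."""

    def __init__(self, data):
        # Keep a private copy so later changes to the input do not leak in.
        self._data = dict(data)

    def __getattr__(self, name):
        # Python calls __getattr__ only after normal attribute lookup fails.
        # Refusing private names avoids infinite recursion if _data is unset.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self._data[name]
        except KeyError:
            raise AttributeError(name) from None

    def __getitem__(self, key):
        # Dictionary-style access delegates directly to the wrapped dict.
        return self._data[key]


cfg = ConfigSketch({"feature_cols": ["f1", "f2"], "batch_col_name": "batch"})
same_value = cfg.feature_cols is cfg["feature_cols"]
```

Raising `AttributeError` rather than `KeyError` from `__getattr__` keeps built-ins such as `hasattr` and `getattr` with a default working as expected on the wrapper.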