
Configuration

Runtime-mutable, disk-persisted engine configuration.

LSMConfig

LSMConfig(path)

Thread-safe, disk-persisted engine configuration.

Read any field as an attribute:

config.max_memtable_size_mb   # → 64
config.max_memtable_bytes     # → 67108864 (convenience)
config.bloom_fpr              # → 0.05 (dev) or 0.01 (prod)

Update at runtime:

config.set("max_memtable_entries", 500_000)
config.set("bloom_fpr_prod", 0.001)
# Writes to disk synchronously; all subsequent reads see the new value.

Bloom filter configuration:

  • bloom_fpr_dev — false positive rate in dev mode (default 0.05). Higher FPR means smaller filters and faster builds.
  • bloom_fpr_prod — false positive rate in prod mode (default 0.01). Lower FPR means fewer false disk reads at the cost of larger filters.
  • The convenience property bloom_fpr auto-selects based on env.
  • The expected item count (bloom_n) is not configurable — it is derived from the actual data size at flush time (len(snapshot)) and compaction time (sum of input record counts).

Load or create engine configuration from path.

If the file exists, its contents are merged with built-in defaults so that new keys introduced in code upgrades are always present. If the file does not exist or is corrupt, defaults are used and persisted to disk.

Parameters:

Name  Type  Description                      Default
path  Path  Filesystem path to config.json.  required

Source code in app/engine/config.py
def __init__(self, path: Path) -> None:
    """Load or create engine configuration from *path*.

    If the file exists, its contents are merged with built-in defaults
    so that new keys introduced in code upgrades are always present.
    If the file does not exist or is corrupt, defaults are used and
    persisted to disk.

    Args:
        path: Filesystem path to ``config.json``.
    """
    self._path = path
    self._lock = threading.RLock()
    self._data: dict[str, int | float | str] = {}
    self._load()
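
The body of `_load()` is not shown on this page. The load-or-create behavior described in the docstring could be sketched roughly as follows. `EXAMPLE_DEFAULTS` and `load_with_defaults` are hypothetical names for illustration only; the real defaults live in `_DEFAULTS`. This sketch also rewrites the file on every load and keeps unknown on-disk keys, which may differ from the actual implementation.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical subset of defaults, for illustration only.
EXAMPLE_DEFAULTS = {"env": "dev", "max_memtable_size_mb": 64, "bloom_fpr_dev": 0.05}


def load_with_defaults(path: Path) -> dict:
    """Merge on-disk values over the defaults; fall back to defaults on error."""
    data = dict(EXAMPLE_DEFAULTS)
    try:
        on_disk = json.loads(path.read_text())
        if isinstance(on_disk, dict):
            data.update(on_disk)  # on-disk values win; new default keys survive
    except (OSError, ValueError):
        pass  # missing or corrupt file: keep the defaults
    path.write_text(json.dumps(data, indent=2))  # persist the merged result
    return data


with tempfile.TemporaryDirectory() as tmp:
    cfg_path = Path(tmp) / "config.json"
    cfg_path.write_text('{"max_memtable_size_mb": 128}')
    merged = load_with_defaults(cfg_path)

    corrupt_path = Path(tmp) / "corrupt.json"
    corrupt_path.write_text("{not valid json")
    recovered = load_with_defaults(corrupt_path)
```

Here `merged` keeps the on-disk `max_memtable_size_mb` while picking up the remaining defaults, and `recovered` is simply a copy of the defaults.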

is_dev property

True when env is 'dev'.

is_prod property

True when env is 'prod'.

max_memtable_bytes property

Convenience: max_memtable_size_mb converted to bytes.

bloom_fpr property

False positive rate for bloom filters, based on current env.

Dev mode uses a higher FPR (smaller filters, faster builds). Prod mode uses a lower FPR (fewer false positives, larger filters).
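
The source for these properties is not listed here. As a minimal sketch of how such derived values might be computed, the hypothetical `ExampleConfig` class below stands in for `LSMConfig`. It assumes that any non-prod environment uses the dev rate. The MiB multiplier is consistent with the documented example of 64 mapping to 67108864 bytes.

```python
class ExampleConfig:
    """Hypothetical stand-in for LSMConfig's derived properties."""

    def __init__(self, data: dict) -> None:
        self._data = data

    @property
    def is_dev(self) -> bool:
        return self._data["env"] == "dev"

    @property
    def is_prod(self) -> bool:
        return self._data["env"] == "prod"

    @property
    def max_memtable_bytes(self) -> int:
        # Megabytes to bytes, using 1024 * 1024 bytes per MB.
        return self._data["max_memtable_size_mb"] * 1024 * 1024

    @property
    def bloom_fpr(self) -> float:
        # Assumption: anything that is not prod falls back to the dev rate.
        key = "bloom_fpr_prod" if self.is_prod else "bloom_fpr_dev"
        return self._data[key]


example = ExampleConfig(
    {
        "env": "prod",
        "max_memtable_size_mb": 64,
        "bloom_fpr_dev": 0.05,
        "bloom_fpr_prod": 0.01,
    }
)
```

With `env` set to `"prod"`, `example.bloom_fpr` reads the `bloom_fpr_prod` value; switching `env` to `"dev"` would select `bloom_fpr_dev` instead.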

__getattr__(name)

Read a config field by name. Thread-safe.

Source code in app/engine/config.py
def __getattr__(self, name: str) -> int | float | str:
    """Read a config field by name. Thread-safe."""
    # Avoid recursion during __init__
    if name.startswith("_"):
        raise AttributeError(name)
    with self._lock:
        if name in self._data:
            return self._data[name]
    raise AttributeError(
        f"Unknown config key: {name!r}. "
        f"Valid keys: {', '.join(sorted(_DEFAULTS))}"
    )

set(key, value)

Update a config field and persist to disk.

Returns (old_value, new_value). Raises ConfigError if the key is unknown or the value is invalid.

Source code in app/engine/config.py
def set(
    self, key: str, value: int | float | str,
) -> tuple[int | float | str, int | float | str]:
    """Update a config field and persist to disk.

    Returns ``(old_value, new_value)``.
    Raises :class:`ConfigError` if the key is unknown or the value
    is invalid.
    """
    if key not in _DEFAULTS:
        raise ConfigError(
            f"Unknown config key: {key!r}. "
            f"Valid keys: {', '.join(sorted(_DEFAULTS))}"
        )

    expected_type = type(_DEFAULTS[key])

    # String fields: validate against allowed values
    if expected_type is str:
        str_value = str(value)
        if key == "env" and str_value not in _VALID_ENV_VALUES:
            raise ConfigError(
                f"Invalid value for {key}: {str_value!r}. "
                f"Must be one of {sorted(_VALID_ENV_VALUES)}"
            )
        cast_value: int | float | str = str_value
    else:
        try:
            cast_value = expected_type(value)
        except (ValueError, TypeError) as exc:
            raise ConfigError(
                f"Invalid value for {key}: {value!r} "
                f"(expected {expected_type.__name__})"
            ) from exc

        if cast_value <= 0:  # type: ignore[operator]
            raise ConfigError(
                f"Value for {key} must be positive, got {cast_value}"
            )

    with self._lock:
        old = self._data[key]
        self._data[key] = cast_value
        self._save()

    logger.info(
        "Config updated", key=key, old=old, new=cast_value,
    )
    return (old, cast_value)
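
The casting rules in `set()` have a few consequences that are easy to miss. The standalone function below is a simplified restatement of the validation step only, without locking or persistence. `coerce` is a hypothetical name, and it raises `ValueError` where the real method raises `ConfigError`.

```python
def coerce(default, value):
    """Cast value to the type of default, mirroring the checks in set()."""
    expected_type = type(default)
    if expected_type is str:
        return str(value)  # string fields are stringified, never range-checked
    cast = expected_type(value)  # may raise ValueError or TypeError
    if cast <= 0:
        raise ValueError(f"must be positive, got {cast}")
    return cast


from_string = coerce(64, "128")   # numeric strings are accepted for int fields
truncated = coerce(64, 2.9)       # floats are truncated toward zero by int()
widened = coerce(0.01, 1)         # ints are widened to float for float fields
```

Two points follow from the listed source: a fractional value below 1 passed to an integer field truncates to 0 and is rejected as non-positive, and only positivity is checked, so a false positive rate above 1.0 would pass this validation.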

to_dict()

Return a snapshot of all config values.

Source code in app/engine/config.py
def to_dict(self) -> dict[str, int | float | str]:
    """Return a snapshot of all config values."""
    with self._lock:
        return dict(self._data)

to_json(indent=2)

Return config as a formatted JSON string.

Source code in app/engine/config.py
def to_json(self, indent: int = 2) -> str:
    """Return config as a formatted JSON string."""
    return json.dumps(self.to_dict(), indent=indent)

load(data_root, config_path=None) classmethod

Load config from config_path, defaulting to <data_root>/config.json.

Source code in app/engine/config.py
@classmethod
def load(
    cls,
    data_root: Path,
    config_path: Path | None = None,
) -> LSMConfig:
    """Load config from *config_path*, defaulting to ``<data_root>/config.json``."""
    path = config_path or (data_root / "config.json")
    return cls(path)
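
The path resolution in `load()` can be exercised on its own. The helper below repeats the same expression outside the class; `resolve_config_path` is a hypothetical name used only for this illustration.

```python
from pathlib import Path
from typing import Optional


def resolve_config_path(data_root: Path, config_path: Optional[Path] = None) -> Path:
    """Use the explicit path when given, else <data_root>/config.json."""
    return config_path or (data_root / "config.json")


default_path = resolve_config_path(Path("/data"))
explicit_path = resolve_config_path(Path("/data"), Path("/etc/lsm.json"))
```

An explicit `config_path` takes precedence, so `data_root` is ignored for the config location in that case.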

Bloom Filter Settings

The bloom filter's false positive rate is environment-aware:

Key             Dev default  Prod default  Description
bloom_fpr_dev   0.05         -             FPR used when env = "dev"
bloom_fpr_prod  -            0.01          FPR used when env = "prod"

The convenience property config.bloom_fpr auto-selects based on the current env.

The expected item count is derived from the actual data — not configurable:

  • Flush: exact snapshot size (len(snapshot))
  • Compaction: total input record count

Update at runtime:

engine.update_config("bloom_fpr_prod", 0.001)  # tighter FPR
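
To get a feel for the size trade-off between these rates, the standard sizing formula for an optimally configured bloom filter, m = -n ln(p) / (ln 2)^2 bits for n items at false positive rate p, can be evaluated directly. This is a general estimate and not a statement about how this engine's filter is implemented; the function names are hypothetical.

```python
import math


def bloom_bits_per_key(fpr: float) -> float:
    """Optimal bits per key for a target false positive rate."""
    return -math.log(fpr) / (math.log(2) ** 2)


def bloom_size_bytes(n_items: int, fpr: float) -> int:
    """Approximate filter size in bytes for n_items at the given rate."""
    return math.ceil(n_items * bloom_bits_per_key(fpr) / 8)


for rate in (0.05, 0.01, 0.001):
    size = bloom_size_bytes(1_000_000, rate)
    print(f"fpr={rate}: {bloom_bits_per_key(rate):.1f} bits/key, {size} bytes per 1M keys")
```

Under this formula the dev default of 0.05 needs about 6.2 bits per key, the prod default of 0.01 about 9.6, and the tighter 0.001 about 14.4, which is the size cost referred to above.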

ConfigError

ConfigError

Bases: LSMError

Raised when configuration load or save fails, or when set() receives an unknown key or an invalid value.