[WIP] Migration to structured configurations #3193

@deruyter92

Description

Migrating to typed and validated configurations

This issue serves as a placeholder to summarize the work in progress for migrating from plain dictionaries to typed and validated configuration classes. The idea was originally formulated in #3172, but to facilitate easier review and a smooth migration, the work will be split into intermediate steps with separate PRs. The old configuration system will stay in place during the intermediate steps, and a versioning system will be put in place to ensure long-term backwards compatibility across config versions.

Procedure

The migration will consist of small, subsequent PRs, which can be reviewed separately before being merged into a feature branch.

Work in progress / roadmap

(ticking after review and merging into the feature branch)

  • Centralize project configuration logic
  • Improve testing to capture existing behavior
  • New config_mixin for easy moving between types (yaml file <> dict <> dataclass <> DictConfig)
  • Add typed configs for project config, pytorch config, etc. with pydantic validation
  • Add typed 3D Project Configs
  • Add typed configs for tensorflow
  • Replace dictionary configs with identical DictConfig version (smooth transition)
  • Add versioning system for migration between old and new configs
  • Improve configurations -> reduce duplicated fields, correct casing
  • Address in-place configuration edits throughout the pipeline
  • Add aliasing system for accessing new fields using old fieldnames (e.g. corer2move2 -> corner2move2)
  • Replace loaders in core config (e.g. deprecate read_config in favor of typed ProjectConfig)
  • Fix circular imports in core/config
  • Verify test coverage and improve if necessary
  • Improve loaders in e.g. deeplabcut/pose_estimation_pytorch/data/base.py (mismatching init, mixed project/pose logic)
  • Consider removing the OmegaConf dependency and migrating to fully typed configs
  • Add LazyConfig?
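
To illustrate the config_mixin idea from the roadmap, here is a minimal stdlib-only sketch of the dict <> dataclass leg of the round trip (the YAML and DictConfig legs would delegate to pyyaml and omegaconf; all names here are hypothetical, not the actual PR design):

```python
from dataclasses import dataclass, asdict, fields

# Hypothetical sketch of the proposed config_mixin; the real implementation
# would also provide to_yaml/from_yaml and to_dictconfig/from_dictconfig.
class ConfigMixin:
    def to_dict(self) -> dict:
        return asdict(self)

    @classmethod
    def from_dict(cls, data: dict):
        # Ignore unknown keys so old configs carrying extra fields still load.
        names = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in data.items() if k in names})

@dataclass
class ProjectConfig(ConfigMixin):
    task: str = "pose"
    iteration: int = 0

cfg = ProjectConfig.from_dict({"task": "reaching", "iteration": 3, "legacy_key": 1})
assert cfg.to_dict() == {"task": "reaching", "iteration": 3}
```

Tolerating unknown keys on the way in, while emitting only typed fields on the way out, is one way to keep old on-disk configs loadable during the transition.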

Related PRs

  1. [dev] C1 - centralize project config I/O and add testing #3190
  2. [dev] C2 - Add typed configs as pydantic dataclasses & omegaconf dictconfigs #3191
  3. [dev] C3 - Replace configuration dictionaries with DictConfigs. #3194
  4. [dev] C4 - add config migration system #3197
    ...
    [WIP] Final migration to configuration version 1: structured and validated configs #3198
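
One common shape for the config migration system introduced in C4 is a version-keyed chain of upgrade functions. The sketch below is purely illustrative of that pattern; field names, function names, and the rename shown are assumptions, not the actual design of PR #3197:

```python
# Hypothetical migration chain: each step upgrades a raw config dict by one
# version, and migrate() walks the chain until the target version is reached.
def _v0_to_v1(cfg: dict) -> dict:
    cfg = dict(cfg)  # never mutate the caller's dict
    # Illustrative field rename; the real mappings live in the migration PR.
    if "corner2move2" not in cfg and "corer2move2" in cfg:
        cfg["corner2move2"] = cfg.pop("corer2move2")
    cfg["config_version"] = 1
    return cfg

MIGRATIONS = {0: _v0_to_v1}

def migrate(cfg: dict, target: int = 1) -> dict:
    version = cfg.get("config_version", 0)  # legacy configs carry no version
    while version < target:
        cfg = MIGRATIONS[version](cfg)
        version = cfg["config_version"]
    return cfg
```

Keying migrations by source version means a config from any historical release can be upgraded in one call, which is what long-term backwards compatibility requires.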

Motivation

As formulated by @arashsm79 in #3172

Summary

  • Introduce new configuration classes for inference, logging, model, pose, project, runner, and training settings.
  • Refactor data loading mechanisms to utilize the new configuration structures.
  • Move the multithreading and compilation options in the inference configuration to the config module.
  • Add typed configuration for logging.
  • Update dataset loaders to accept model configurations directly or via file paths.
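
The last point above could look like the following stdlib-only sketch. JSON stands in for YAML so the example needs no third-party parser, and `ModelConfig`/`load_model_config` are hypothetical names, not DeepLabCut's actual API:

```python
from dataclasses import dataclass
from pathlib import Path
import json

# Hypothetical typed config for illustration only.
@dataclass
class ModelConfig:
    net_type: str
    batch_size: int

def load_model_config(source) -> ModelConfig:
    """Accept an already-built ModelConfig, a plain dict, or a file path.

    JSON is parsed here to keep the sketch stdlib-only; the real loaders
    would read YAML instead.
    """
    if isinstance(source, ModelConfig):
        return source  # already typed, pass through unchanged
    if isinstance(source, (str, Path)):
        source = json.loads(Path(source).read_text())  # load from file
    return ModelConfig(**source)  # build the typed config from a dict
```

A loader with this signature lets call sites pass whichever form they already have, which eases the transition away from purely path-based APIs.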

Why Typed & Structured Configuration (OmegaConf + Pydantic)

  • Strong guarantees for correctness

    • Runtime type safety ensures invalid configs fail fast with clear errors instead of silently producing incorrect training runs.
    • Schema-validated configs dramatically reduce debugging time for users and maintainers.
  • Static typing improves developer velocity

    • IDE autocomplete and inline documentation make configs discoverable and self-documenting.
    • Refactors become safer: config changes are more likely to be caught at development time.
  • Hierarchical, composable configuration

    • Natural representation of DeepLabCut’s nested project/model/training settings.
    • Easy composition and merging from multiple sources (base config, model presets, experiment overrides).
  • Cleaner overrides and defaults.

  • Structured configs make it easier to define parameter ranges for tuning and automation.

  • Config schemas can be versioned and evolve safely over time while preserving backward compatibility.

  • Full, validated configuration can be saved alongside results, which improves reproducibility and transparency.

  • Builds on well-maintained, widely adopted libraries (OmegaConf, Pydantic).
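
To make the fail-fast point concrete, here is a stdlib stand-in for the pydantic validation described above (the field names are made up for illustration; a real pydantic model would express the same constraints declaratively):

```python
from dataclasses import dataclass

# Stdlib sketch of fail-fast validation: invalid values raise immediately
# at construction time instead of surfacing hours later as a broken run.
@dataclass
class TrainingConfig:
    max_iters: int = 100_000
    save_epochs: int = 50

    def __post_init__(self):
        if not isinstance(self.max_iters, int) or self.max_iters <= 0:
            raise ValueError(f"max_iters must be a positive int, got {self.max_iters!r}")
        if not isinstance(self.save_epochs, int) or self.save_epochs <= 0:
            raise ValueError(f"save_epochs must be a positive int, got {self.save_epochs!r}")

try:
    TrainingConfig(max_iters=-5)
except ValueError as e:
    print(e)  # clear, immediate error instead of a silent bad training run
```

Pydantic would additionally coerce compatible types and report all field errors at once, but the contract is the same: a config object that exists is a config object that is valid.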

Resources for learning more about structured configs:

Future Work

  • Currently, default model definitions are still stored as YAML files in the package. Moving to LazyConfig, as in Detectron2, would improve this significantly.

More things that could be done (@deruyter92):

  • I think we need to make sure that every time a model is used, all changes to the project's config.yaml are also reflected in the model's configuration under metadata.
  • There might be a better way to handle things in deeplabcut/pose_estimation_pytorch/data/base.py.

Metadata

Assignees

No one assigned

Labels

WORK IN PROGRESS (developers are currently working on this feature... stay tuned), backwards compatibility (issues concerning prior to current versions), enhancement (new feature or request)