This roadmap outlines the current and future direction of ModelSkill — a toolkit for evaluating simulation model quality by comparing results against observations.
For questions or feature requests, please open a GitHub Discussion.
- Baseline Model Comparisons — Compare any model against synthetic baselines (mean, persistence) to quantify the added value of a simulation.
- Custom Metrics — Define domain-specific quality metrics that integrate fully into all skill tables and reports.
- Spatial and Temporal Skill Aggregation — Assess model performance by geographic region, time period, season, or any custom grouping to identify where and when a model performs well or poorly.
- Network Model Support — Compare MIKE 1D hydraulic network simulations against observations at network nodes, covering collection systems, water distribution, and river networks.
- Vertical Profile Assessment — Validate 3D models by comparing against depth-varying observations such as temperature and salinity profiles.
- Automatic Report Generation — Generate standardised model skill assessment reports in HTML, PDF, or PowerPoint from a single command.
- Band-Pass Filtering — Separate model skill assessment into slow dynamics and fast dynamics to understand where a model captures trends versus peaks.
- Ensemble and Probabilistic Forecast Support — Evaluate ensemble model runs using established probabilistic scoring methods alongside standard deterministic metrics.
- Forecast Lead-Time Analysis — Assess how model skill degrades with forecast horizon to optimise forecast update frequency and communicate prediction reliability.
- Outlier Detection — Automatically identify suspect observations using model-observation differences to improve data quality and skill assessment reliability.
- Rolling Skill Assessment — Track how model skill evolves over time using moving windows to detect performance trends and seasonal patterns.
- Web Application — Provide a browser-based interface for model skill assessment, accessible to users without Python experience.
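
To make the baseline comparison item above more concrete, here is a minimal sketch of how a skill score against mean and persistence baselines could be computed. It uses only the Python standard library and is not ModelSkill's API: the function names (`mse`, `skill_score`, `skill_vs_baselines`), the MSE-based score definition, and the sample values are all assumptions chosen for illustration.

```python
def mse(pred, obs):
    """Mean squared error between two equal-length sequences."""
    if len(pred) != len(obs) or len(obs) == 0:
        raise ValueError("pred and obs must be non-empty and of equal length")
    return sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs)


def skill_score(model, obs, baseline):
    """Skill relative to a baseline: 1 - MSE(model) / MSE(baseline).

    1 is a perfect model, 0 is no better than the baseline,
    and negative values are worse than the baseline.
    """
    baseline_mse = mse(baseline, obs)
    if baseline_mse == 0:
        raise ValueError("baseline matches the observations exactly; score is undefined")
    return 1.0 - mse(model, obs) / baseline_mse


def skill_vs_baselines(model, obs, lag=1):
    """Score a model against mean and persistence baselines (hypothetical helper).

    The first `lag` samples are dropped from the evaluation because the
    persistence baseline has no earlier observation to repeat for them.
    """
    if not 0 < lag < len(obs):
        raise ValueError("lag must be positive and shorter than the series")
    obs_eval = obs[lag:]
    model_eval = model[lag:]
    # Mean baseline: always predict the mean of the evaluated observations.
    mean_pred = [sum(obs_eval) / len(obs_eval)] * len(obs_eval)
    # Persistence baseline: predict the observation from `lag` steps earlier.
    persistence_pred = obs[:-lag]
    return {
        "mean": skill_score(model_eval, obs_eval, mean_pred),
        "persistence": skill_score(model_eval, obs_eval, persistence_pred),
    }


# Made-up sample series, for illustration only.
obs = [1.0, 2.0, 3.0, 4.0, 5.0]
model = [1.0, 2.1, 2.9, 4.2, 5.0]
scores = skill_vs_baselines(model, obs)
print(scores)
```

With the mean of the evaluated observations as the reference, this score is the same quantity as the Nash-Sutcliffe efficiency. Scoring against persistence is usually the stricter test for slowly varying signals, because the previous observation is already a good predictor there.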