Knowledge Management for Data Science (KMDS)

Capture, organize, and reuse knowledge from your data science experiments.

🌟 What is KMDS?

KMDS is an ontology-backed ecosystem for systematic knowledge management in data science and analytics workflows. It documents the incremental process of experimentation, data exploration, and model selection, capturing decisions, rationale, and repository schemas so that insights are not lost over time.

The Problem It Solves

Experimental work generates a fragmented stream of insights, local documentation, and Jupyter notebooks. This context is typically lost when a research trail goes cold. The KMDS ecosystem fixes this by providing a unified, structured approach to log, map, search, and visually audit your data engineering artifacts.

Who Can Use KMDS?

| User | How they interact with KMDS |
| --- | --- |
| Data scientist | Python API, local LLM integrations, notebooks, and CLI framework |
| Software developer | Automated repository-mapping utilities and automated logging hooks for pipelines |
| Business analyst | Interactive UI Workbench dashboard and plain-English natural language ingestion |

🎥 Watch a quick overview of KMDS: YouTube Video

✨ Key Features

Interactive UI Workbench (kmds-ui): View, edit, and safely serialize knowledge graphs with special handling for long text notes and file context preservation.
Automated Repository Scanning (kmds-data-helper): Parse local codebases using a multi-persona engine (Data Scientist, Tech Lead, Architect) to synthesize documentation and code into structured knowledge graphs.
Natural Language Ingestion: Describe insights in plain English for automatic logging to your ontology graph.
Semantic Vector Search: Build high-performance local vector indices for querying analytical findings.
LLM Search Orchestration: Use Ollama-powered, intelligent routing for complex knowledge queries.
Enterprise Ready: KMDS is meant to be used within a Git repository and inherits the security context of that repository. Please see this document.

📂 KMDS Data Helper (`kmds-data-helper`)

The kmds-data-helper package introduces a multi-persona analysis framework for existing data science repositories. Using local LLMs (via Ollama), it scans documentation, schemas, and notebooks to output complete KMDS knowledge graphs.

Key Features

Toggleable Role Personas: Switch between Data Scientist, Tech Lead, and Architect behaviors via a kmds_config.yaml file.
Automated Artifact Synthesis: Scans directories to auto-generate structured diagnostic files (full_service_report.json, kmds_summary.json).
Direct Graph Production: Compiles generated report structures directly into a standardized project_knowledge_graph.xml.
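Before compiling a graph, it can help to check what the scan actually produced. The following is a minimal sketch, not part of kmds-data-helper: the helper name `describe_reports` and the default `output` directory are assumptions made for illustration, and the code makes no claim about the schema of either report. It only reports the top-level JSON structure.

```python
import json
from pathlib import Path

# Report file names as listed above; the directory is an assumption.
REPORT_NAMES = ("full_service_report.json", "kmds_summary.json")


def describe_reports(output_dir="output"):
    """Return {file name: short description of its top-level JSON structure}."""
    summary = {}
    for name in REPORT_NAMES:
        path = Path(output_dir) / name
        if not path.is_file():
            summary[name] = "missing"
            continue
        data = json.loads(path.read_text(encoding="utf-8"))
        if isinstance(data, dict):
            summary[name] = "object with keys: " + ", ".join(sorted(data))
        elif isinstance(data, list):
            summary[name] = f"array with {len(data)} items"
        else:
            summary[name] = type(data).__name__
    return summary
```

Calling `describe_reports("output")` after a scan gives a quick view of which reports exist and how they are organized at the top level.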

🖥️ KMDS Workbench UI

The kmds-ui extension package provides a web dashboard built to view, audit, and modify knowledge graph files generated by the KMDS ecosystem. It avoids the namespace-prefix corruption and structural degradation that can occur when these files are edited in general-purpose ontology tools such as Protégé.

Key Technical Advantages

Prefix-Agnostic Processing: Splits and parses XML fragments dynamically at runtime to handle KMDS files lacking explicit namespace declarations without crashing.
Proportional Narrative Isolation: Uses a 75% proportional grid with dynamic word-wrapping to display long text fields without truncation.
Preserved File Context: Automatically tracks the original file name during ingestion, ensuring the updated graph downloads with matching name signatures.
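To illustrate why undeclared prefixes matter: a standard XML parser rejects a file that uses a prefix such as `kmds:` without a matching `xmlns:kmds` declaration. The sketch below shows one possible workaround using only the Python standard library. It is not the kmds-ui implementation, which is described above as splitting and parsing fragments at runtime. The function names, the placeholder namespace URIs, and the sample element names in the usage note are all assumptions for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Prefixes used in element or attribute names, e.g. <kmds:Note> or rdf:about=
PREFIX_USE = re.compile(r"[<\s/]([A-Za-z_][\w.-]*):[A-Za-z_]")
# Prefixes that already have an xmlns:prefix="..." declaration
PREFIX_DECL = re.compile(r"xmlns:([A-Za-z_][\w.-]*)\s*=")
# Start of the first element tag (skips <?xml ...?> and <!-- ... -->)
ROOT_START = re.compile(r"<[A-Za-z_][\w.:-]*")


def parse_lenient(xml_text):
    """Parse XML, binding undeclared prefixes to placeholder namespaces."""
    try:
        return ET.fromstring(xml_text)
    except ET.ParseError:
        used = set(PREFIX_USE.findall(xml_text)) - {"xml", "xmlns"}
        missing = sorted(used - set(PREFIX_DECL.findall(xml_text)))
        root = ROOT_START.search(xml_text)
        if root is None or not missing:
            raise  # malformed for some other reason
        # Inject placeholder declarations into the root start tag and retry.
        decls = "".join(f' xmlns:{p}="urn:placeholder:{p}"' for p in missing)
        patched = xml_text[:root.end()] + decls + xml_text[root.end():]
        return ET.fromstring(patched)


def local_name(tag):
    """Strip the '{namespace}' part that ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]
```

For example, `parse_lenient('<kmds:Graph><kmds:Note>text</kmds:Note></kmds:Graph>')` returns a root element even though `kmds` is never declared, and `local_name` lets downstream code compare on the bare tag name regardless of prefix.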

🚀 Getting Started

1. Installation

Install the entire modular framework directly from PyPI:

# Install core logging, UI, and data helper
pip install kmds kmds-ui kmds-data-helper
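To confirm the install, the following sketch lists the installed version of each distribution named in the pip command. It relies only on the standard library's `importlib.metadata`; the helper name `installed_versions` is an assumption for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names taken from the pip command above.
KMDS_PACKAGES = ("kmds", "kmds-ui", "kmds-data-helper")


def installed_versions(packages=KMDS_PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions().items():
    print(f"{name}: {ver or 'not installed'}")
```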

2. Using the Interactive UI

Launch your workbench application from the terminal:

kmds-workbench

Open http://127.0.0 in your browser.

3. Automatically Building Graphs from Repositories (`kmds-data-helper`)

Set up a project directory containing documents/, notebooks/, and data_dictionary/.

Run the automatic aggregator tool:

kmds-kb --workspace . --project-file project_knowledge_graph.xml --mode auto
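When the aggregator is driven from a script or notebook, a small pre-flight check can catch a missing folder before the scan starts. This is a sketch under stated assumptions: the helper names are hypothetical, the folder names come from the setup step above, and the command uses only the flags shown in this section.

```python
from pathlib import Path

# Folder names from the setup step above.
EXPECTED_DIRS = ("documents", "notebooks", "data_dictionary")


def missing_dirs(workspace):
    """Return the expected folders that do not exist under the workspace."""
    root = Path(workspace)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


def kmds_kb_command(workspace=".", project_file="project_knowledge_graph.xml", mode="auto"):
    """Build the kmds-kb argument list with the flags shown above."""
    return [
        "kmds-kb",
        "--workspace", str(workspace),
        "--project-file", str(project_file),
        "--mode", mode,
    ]
```

Once `missing_dirs(".")` returns an empty list, the result of `kmds_kb_command()` can be passed to `subprocess.run(cmd, check=True)`.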

To parse individual report paths directly, use the adapter interface command:

kmds-analyze --input output/full_service_report.json --project-file project_knowledge_graph.xml --create-project --workflow-name kmds_project_workflow --mode auto

4. Quick Summary Logging via CLI (`kmds`)

kmds-summary-log \
  --summary "Daily reporting workflow for support operations." \
  --workflow-name "support_reporting" \
  --workflow-type application \
  --project-file ./support_reporting.xml \
  --create-project --no-prompt
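The same command can be assembled from Python, which avoids shell-quoting problems when the summary text contains quotes or other special characters. A minimal sketch, assuming a hypothetical helper `summary_log_command` that reproduces exactly the flags shown above:

```python
import shlex


def summary_log_command(summary, workflow_name, project_file, workflow_type="application"):
    """Build the kmds-summary-log argument list using the flags shown above."""
    return [
        "kmds-summary-log",
        "--summary", summary,
        "--workflow-name", workflow_name,
        "--workflow-type", workflow_type,
        "--project-file", str(project_file),
        "--create-project",
        "--no-prompt",
    ]


cmd = summary_log_command(
    "Daily reporting workflow for support operations.",
    "support_reporting",
    "./support_reporting.xml",
)
# Print a shell-safe rendering for review before running it.
print(shlex.join(cmd))
```

Passing `cmd` as a list to `subprocess.run(cmd, check=True)` runs it without going through a shell.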

5. Executing Semantic Knowledge Queries

kmds-search \
  --kb ./project_knowledge_graph.xml \
  --query "What data quality issues were identified?" \
  --n-results 3
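To run several queries against the same knowledge base, the CLI can be wrapped in a short loop. This is a sketch, not a KMDS API: `search_command` and `run_queries` are hypothetical helpers, the output is returned as raw text because the output format is not described in this section, and the `runner` parameter exists only so the loop can be exercised without the CLI installed.

```python
import subprocess


def search_command(kb, query, n_results=3):
    """Build the kmds-search argument list with the flags shown above."""
    return [
        "kmds-search",
        "--kb", str(kb),
        "--query", query,
        "--n-results", str(n_results),
    ]


def run_queries(kb, queries, n_results=3, runner=subprocess.run):
    """Run each query and return {query: raw stdout text}."""
    results = {}
    for query in queries:
        done = runner(
            search_command(kb, query, n_results),
            capture_output=True,
            text=True,
            check=True,  # raise if the CLI exits with a non-zero status
        )
        results[query] = done.stdout
    return results
```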

The full documentation covers custom LLM functions, available routing templates, and output formats.

This repository includes two detailed examples:

Analytics Example: Evaluates the effectiveness of a ticket resolution help desk.
Machine Learning Example: Uses Principal Component Analysis (PCA) to summarize online store sales activity.
- Notebooks
- Infographic
