This repository modernizes the exercises from Neural Networks and Deep Learning so the material can be demonstrated live in the classroom. The legacy NumPy/Theano scripts are still available for reference, but the default experience now uses PyTorch together with an interactive terminal dashboard that visualizes gradient descent in real time.
- PyTorch training stack – CUDA-ready models for RTX 20/30/40/50-series GPUs with optional mixed precision.
- Live Rich dashboard – animated loss and accuracy charts, gradient norms, step-by-step logs, and ASCII art previews of the current mini-batch so students can “see” what the network is learning.
- Checkpointing & metrics export – periodic checkpoints capture the model, optimizer, and scheduler states; JSONL metrics logs can be replayed or plotted later.
- Typer CLI – launch scripted runs or quick classroom demos with a single command.
- Colab-friendly notebook helper – collect training metrics directly in Python with `run_notebook_training` for plotting inside Google Colab.
- Install Python – Python 3.10 or newer is recommended.
- Set up a virtual environment (optional but encouraged):

  ```bash
  python -m venv .venv
  source .venv/bin/activate
  ```

- Install dependencies. Choose the right PyTorch build for your hardware.
  - CPU-only:

    ```bash
    pip install --upgrade pip
    pip install torch torchvision --index-url https://download.pytorch.org/whl/cpu
    pip install -e .
    ```

  - NVIDIA RTX-5060 Ti (CUDA 12.4): Install the up-to-date CUDA-enabled wheels directly from the PyTorch index. Make sure your NVIDIA driver is recent enough for CUDA 12.4 (driver 550.xx or newer as reported by `nvidia-smi`).

    ```bash
    pip install --upgrade pip
    pip install torch torchvision --index-url https://download.pytorch.org/whl/cu124
    pip install -e .
    ```

    These wheels bundle the appropriate CUDA runtime, so you do not need to install a separate CUDA toolkit for Colab or local demos.

Installing the project in editable mode exposes the `dlp` command line tool described below.
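If you installed the CUDA wheels, it is worth confirming that PyTorch actually sees your GPU before running a live demo. The snippet below uses only standard PyTorch calls (nothing project-specific):

```python
import torch

# Show the installed PyTorch version and the CUDA version it was built against.
print(torch.__version__, torch.version.cuda)

# True only when a CUDA build is installed and the driver exposes a GPU.
if torch.cuda.is_available():
    print("GPU detected:", torch.cuda.get_device_name(0))
else:
    print("No GPU detected - you are running the CPU-only build or the driver is too old.")
```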
Running the project in Colab is now a first-class experience. The CLI automatically disables the Rich live dashboard when it detects a non-interactive output stream, and the package exposes a notebook helper that returns metrics directly to Python.
- Open a new Colab notebook and install the package:

  ```bash
  !pip install --quiet torch torchvision
  !pip install --quiet git+https://github.com/<REPO_OWNER>/DeepLearningPython.git
  ```

- Import the helper and launch a short fake-data run. The helper forces `enable_live=False` and can return a pandas DataFrame when `return_dataframe=True`:

  ```python
  from pathlib import Path

  from deeplearning_python import run_notebook_training

  result = run_notebook_training(
      epochs=1,
      batch_size=64,
      fake_data=True,
      data_dir=Path("/content/data"),
      log_dir=Path("/content/logs"),
      checkpoint_dir=Path("/content/checkpoints"),
      limit_train_batches=2,
      limit_val_batches=1,
      return_dataframe=True,
  )
  metrics_df = result["metrics"]
  metrics_df.head()
  ```

- Plot metrics or continue experimenting as you would locally. Metrics are written to `/content/logs`, and checkpoints land in `/content/checkpoints` by default. Mount Google Drive first if you want these artifacts to persist across sessions.
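For a quick look at the returned DataFrame inside the same notebook, the sketch below charts every numeric column it contains. The exact column names depend on what the trainer logs, so inspect `metrics_df.columns` before selecting specific series; this reuses the `metrics_df` from the previous step.

```python
import matplotlib.pyplot as plt

# See which metrics the trainer actually recorded before picking out columns.
print(metrics_df.columns.tolist())

# Plot each numeric column as its own subplot for a quick overview.
metrics_df.select_dtypes("number").plot(subplots=True, figsize=(8, 8))
plt.tight_layout()
plt.show()
```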
The repository ships a ready-to-run example notebook at `notebooks/colab_demo.ipynb` that strings these steps together for your classroom or workshop.
The new command line interface lives at `src/deeplearning_python/cli.py`. After installing the package you can launch the interactive trainer:
```bash
dlp train --epochs 5 --batch-size 128 --learning-rate 0.05 --model simple
```

The trainer will:

- stream loss/accuracy plots, gradient norms, and learning rate updates using Rich’s live layout
- display ASCII renderings of the digits in the current mini-batch together with predicted labels
- write checkpoints to `./artifacts/checkpoints/step_XXXXXXX.pt`
- append structured metrics to `./artifacts/logs/metrics.jsonl`
When the CLI detects a non-interactive environment (such as a notebook output cell), it automatically suppresses the live Rich dashboard so the logs stay readable. Launch the command from a local terminal to re-enable the animated dashboard.
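If you are curious what “non-interactive” means in practice, the snippet below shows the usual signals such a check relies on: whether stdout is a real TTY and whether Rich considers the console interactive. This is an illustration only, not necessarily the exact logic used in `src/deeplearning_python/cli.py`.

```python
import sys

from rich.console import Console

# In a notebook output cell both of these are typically False, while in a local
# terminal both are typically True - which is when a live dashboard makes sense.
print("stdout is a TTY:", sys.stdout.isatty())
print("Rich sees an interactive terminal:", Console().is_terminal)
```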
Need a quick classroom walkthrough? `dlp demo` caps the run to a small number of steps, increases logging frequency, and still produces visual updates:

```bash
dlp demo --steps 100 --batch-size 64
```

On Windows you can double-click the helper batch files in `scripts/windows/` to launch the Typer CLI without opening a terminal manually. `demo.bat` starts the classroom-friendly demo (`py -m deeplearning_python.cli demo`) and `train.bat` launches the full training workflow (`py -m deeplearning_python.cli train`).

These helpers rely on the Python `py` launcher that ships with the official Windows installers. If you installed Python from the Microsoft Store or another distribution, ensure the `py` command is available on your PATH before using the batch files.
Common options (see `dlp train --help` for the full list):
- `--model`: `simple`, `regularized`, or `conv`
- `--model-preset`: choose preset hidden sizes and dropout for the simple MLP (`baseline`, `compact`, `wide_dropout`)
- `--hidden-sizes`: adjust the layer sizes of the simple MLP (add each value as its own flag, e.g. `--hidden-sizes 256 --hidden-sizes 128 --hidden-sizes 64`); providing this option overrides the preset
- `--dropout`: override the preset dropout probability for the simple MLP
- `--scheduler`: enable `steplr` or `cosine` learning rate schedules
- `--checkpoint-interval`: how many optimizer steps to wait between checkpoints
- `--preview-interval`: how often to refresh the mini-batch ASCII preview
- `--mixed-precision`: turn on CUDA AMP for faster GPU demos
- `--fake-data`: run against lightweight synthetic data when you do not have network access
Presets provide a quick way to explore architectures without typing out every layer. For example, `--model-preset wide_dropout` configures hidden layers of (256, 128, 64) with dropout 0.1. You can still tweak either parameter directly: `--hidden-sizes` changes the layer layout while keeping the preset's dropout, and `--dropout` overrides the preset's dropout probability. All options are compatible with CPU-only environments, so instructors can rehearse on a laptop before moving to the classroom GPU workstation.
Checkpoints capture the model, optimizer, scheduler, and metadata and are saved in the directory passed with `--checkpoint-dir`. Resume training later with:
```bash
dlp train --resume-from artifacts/checkpoints/step_0000200.pt
```

The JSONL metrics file contains structured entries for both training and validation. You can load it in pandas, Excel, or your visualization tool of choice for post-class analysis.
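As a sketch of that post-class analysis: the metrics file is newline-delimited JSON, so pandas can read it directly, and a checkpoint is a dictionary saved with `torch.save`, so the code lists its keys rather than assuming a particular schema.

```python
import pandas as pd
import torch

# Each line of the JSONL file is one structured metrics record.
metrics = pd.read_json("artifacts/logs/metrics.jsonl", lines=True)
print(metrics.tail())

# Inspect a checkpoint on the CPU and print whatever keys the trainer stored.
# Depending on your PyTorch version, non-tensor metadata may require
# torch.load(..., weights_only=False) for checkpoints you created yourself.
checkpoint = torch.load("artifacts/checkpoints/step_0000200.pt", map_location="cpu")
print(list(checkpoint.keys()))
```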
The original `network.py`, `network2.py`, `network3.py`, and `test.py` files remain untouched for historical reference. They rely on Python 3.5 and Theano and are no longer maintained, but feel free to keep them for comparison during lessons.
Install the optional development extras and run the test suite:
```bash
pip install -e .[dev]
pytest
```

Pull requests and classroom-inspired enhancements are welcome!