paper

HyperbolicLR: Epoch-insensitive Learning Rate Scheduler - Experiments

This directory contains the experimental code, data, and analysis tools for the HyperbolicLR paper. It includes implementations of various deep learning models, data processing utilities, and scripts for running experiments and analyzing results.

Directory Structure

analyze/: Rust-based analysis tools for experimental results
data/: Dataset files
figs/: Generated figures and plots
learning_curve_example/: Rust-based learning curve analysis
learning_curves/: Rust-based learning curve plotting per scheduler
metric/: Rust-based metric calculation tools
osc/: Rust-based oscillation data generation
scripts/: Various Python scripts for plotting and analysis

Main Python Files

`scheduler_config.py`

Centralized scheduler configuration module. All scheduler-related logic (registry, parameter defaults, Optuna search spaces, factory creation) is defined here. The three main_*.py files import from this module, eliminating code duplication.

Supported schedulers:

Code	Scheduler	Type
`N`	No Scheduler	Baseline
`P`	PolynomialLR	Traditional
`C`	CosineAnnealingLR	Traditional
`E`	ExponentialLR	Traditional
`L`	LinearLR	Traditional
`S`	StepLR	Traditional
`OC`	OneCycleLR	Cyclic (per-batch)
`CY`	CyclicLR	Cyclic (per-batch)
`H`	HyperbolicLR	Hyperbolic
`EH`	ExpHyperbolicLR	Hyperbolic
`WH`	Warmup + HyperbolicLR	Hyperbolic + Warmup
`WEH`	Warmup + ExpHyperbolicLR	Hyperbolic + Warmup
`WC`	Warmup + CosineAnnealingLR	Traditional + Warmup

LR-free optimizers (Compare mode):

Name	Optimizer	Recommended LR
`Prodigy`	Prodigy	1.0
`DAdapt`	DAdaptAdam	1.0

`main_cifar10.py`

CIFAR-10/100 image classification experiments.

Models: SimpleCNN, ResNetCIFAR, SimpleViT
Run modes: Run, Compare, Optimize, Optimize-Compare
Integration with Weights & Biases for experiment tracking

`main_osc.py`

Oscillation prediction experiments with LSTM Sequence-to-Sequence model.

Data types: simple, damped, total
Run modes: Run, Compare, Optimize, Optimize-Compare

`main_integral.py`

Integral operator learning experiments.

Models: DeepONet, TFONet
Run modes: Run, Compare, Optimize, Optimize-Compare

`run_all.py`

Non-interactive All-in-One experiment runner. Automates the full Optimize → Optimize-Compare pipeline for all schedulers and LR-free optimizers via CLI arguments only.

Phase 1: Optuna TPE optimization for every scheduler
Phase 2: Epoch sensitivity comparison with best params
Saves results to results_{project}.json

`model.py`

Neural network architectures:

SimpleCNN: Basic convolutional neural network
ResNetCIFAR: ResNet-18/34/50 adapted for CIFAR 32x32 images (3x3 conv stem, no maxpool)
SimpleViT: Lightweight Vision Transformer (patch_size=4, embed_dim=192, depth=6)
LSTM_Seq2Seq: LSTM-based sequence-to-sequence model
DeepONet: Deep Operator Network
TFONet: Transformer-based Operator Network

`util.py`

Data loading and training utilities:

load_cifar10(), load_cifar100(): CIFAR data loading with optional subsetting (subset_ratio >= 1.0 skips subsetting)
Trainer: General-purpose trainer with step_per_batch support for OneCycleLR/CyclicLR
OperatorTrainer: Trainer for operator learning models with step_per_batch support

`result_plot.py`

Result visualization with grouped subplot design (Traditional / Cyclic+Warmup / Hyperbolic / LR-free).

`metric_plot.py`

Standalone LR curve visualization for all scheduler types.

Requirements

See requirements.txt for the list of Python dependencies.

For Rust components, ensure you have Rust installed. Dependencies are managed via Cargo.

Setup

cd paper

# UV (Recommended)
uv venv
uv pip sync requirements.txt

# Install LR-free optimizer dependencies
uv pip install prodigyopt dadaptation

# Activate
source .venv/bin/activate

Data Generation

Before running experiments, generate the required datasets:

# Oscillation data
cd osc && cargo run --release && cd ..

# Integral data
cd integral && cargo run --release && cd ..

# CIFAR-10/100 is downloaded automatically on first run

Running Experiments

All experiment scripts use an interactive CLI (via survey). Run them with uv:

uv run python main_cifar10.py
uv run python main_osc.py
uv run python main_integral.py

Run Modes (Interactive Scripts)

Each main_*.py script supports 4 interactive run modes:

Mode	Description
Run	Single run with manually specified hyperparameters
Compare	Run all scheduler x optimizer combinations with default parameters
Optimize	Optuna hyperparameter optimization (25 trials) for a single scheduler
Optimize-Compare	Load Optuna best trial and run across epoch budgets [50, 100, 150, 200]

All-in-One Workflow (Recommended)

The run_all.py script automates the entire Optimize → Optimize-Compare pipeline non-interactively:

# Full experiment: all schedulers + LR-free optimizers
uv run python run_all.py --task cifar10 --model ResNetCIFAR

# Quick dry run (sanity check)
uv run python run_all.py --task cifar10 --model SimpleCNN \
  --epochs 2 --n-trials 2 --n-seeds 1 --wandb-mode disabled

# Other tasks
uv run python run_all.py --task osc
uv run python run_all.py --task integral --model DeepONet

# Customize
uv run python run_all.py --task cifar10 --model ResNetCIFAR \
  --n-trials 10 --epochs 100 --schedulers EH,WEH,C --skip-lr-free

Key options:

Argument	Default	Description
`--task`	`cifar10`	Task: `cifar10`, `osc`, `integral`
`--model`	(task default)	Model name
`--optimizer`	`AdamW`	Base optimizer: `Adam`, `AdamW`
`--epochs`	`50`	Optimization epoch budget (low for epoch-insensitivity test)
`--n-trials`	`25`	Optuna trials per scheduler
`--n-seeds`	`1`	Seeds during optimization (1=fast, 5=robust)
`--compare-seeds`	`5`	Seeds during comparison phase
`--epoch-budgets`	`50,100,150,200`	Epoch budgets for comparison (ascending)
`--schedulers`	(all)	Comma-separated subset of schedulers
`--skip-lr-free`	`False`	Skip LR-free optimizers
`--skip-compare`	`False`	Skip Phase 2 (only optimize)
`--wandb-mode`	`online`	W&B mode: `online`, `offline`, `disabled`

Interactive Workflow

For individual experiments, use the interactive scripts directly:

Step 1: Pilot Run (Sanity Check)

uv run python main_cifar10.py
# Select: Run mode → CIFAR10 → subset_ratio=0.1 → SimpleCNN → batch_size=128 → epochs=10

Step 2: Optimize Hyperparameters

uv run python main_cifar10.py --project HyperbolicLR-CIFAR10-ResNet
# Select: Optimize mode → CIFAR10 → subset_ratio=1.0 → ResNetCIFAR → batch_size=128 → epochs=200
# Then select optimizer (AdamW) and scheduler (e.g., EH)

Step 3: Epoch Sensitivity Comparison

uv run python main_cifar10.py --project HyperbolicLR-CIFAR10-ResNet
# Select: Optimize-Compare mode → select study → runs epochs=[200, 150, 100, 50]

Step 4: Direct Comparison

uv run python main_cifar10.py --project HyperbolicLR-CIFAR10-ResNet
# Select: Compare mode → runs 2 optimizers × 13 schedulers + 2 LR-free = 28 runs

Full Experiment Matrix

To reproduce all paper results, run for each combination:

CIFAR-10 (Image Classification)

Model	subset_ratio	batch_size	epochs	lr	infimum_lr
SimpleCNN	0.1	128	200	0.01	1e-6
ResNetCIFAR	1.0	128	200	0.01	1e-6
SimpleViT	1.0	128	200	0.001	1e-6

OSC (Time Series Prediction)

Model	dtype	hist	pred	batch_size	epochs	lr	infimum_lr
LSTM_Seq2Seq	simple	50	50	64	200	0.01	1e-6

Integral (Operator Learning)

Model	batch_size	epochs	lr	infimum_lr
DeepONet	128	200	0.005	1e-6
TFONet	128	200	0.001	1e-6

Notes on Specific Schedulers

OneCycleLR (OC) / CyclicLR (CY): These schedulers step per-batch, not per-epoch. The step_per_batch flag is automatically set via is_step_per_batch() in scheduler_config.py.
Warmup variants (WH, WEH, WC): Warmup duration defaults to epochs // 10. Particularly important for ResNetCIFAR (BatchNorm) and SimpleViT (Transformer).
LR-free optimizers: Prodigy and DAdaptAdam use lr=1.0 and self-tune internally. They run without a scheduler (N).

Analysis

After running experiments, results are logged to W&B. To generate analysis outputs:

1. Export W&B Data

Export learning curves from W&B to CSV format (for Rust analysis tools):

data/Result_{Dataset}_{Model}-{Scheduler}.csv

Each CSV should contain columns like {Scheduler}{Epochs}_val_loss (e.g., EH200_val_loss).

2. Rust Analysis Tools

# Learning curve difference analysis (SLCD metric)
cd analyze && cargo run --release && cd ..

# Accuracy-specific analysis
cd analyze && cargo run --release --bin acc && cd ..

# Learning curve plots per scheduler
cd learning_curves && cargo run --release && cd ..

# LR scheduler metric evaluation
cd metric && cargo run --release && cd ..

3. Python Visualization

# LR schedule curves (all schedulers)
uv run python metric_plot.py

# Result bar charts (grouped by scheduler family)
uv run python result_plot.py

# Decay rate analysis (d(eta)/dt comparison)
uv run python scripts/03_decay_rate_analysis.py

# Cumulative LR budget analysis
uv run python scripts/04_cumulative_lr_budget.py

Scheduler Overview

Traditional Schedulers (Baselines)

N (No Scheduler): Constant LR (lr/10 for fair comparison)
L (LinearLR): Linear decay from lr to infimum_lr
S (StepLR): Step decay with gamma=0.1 every epochs//3 steps
P (PolynomialLR): Polynomial decay with power=0.5
C (CosineAnnealingLR): Cosine annealing to eta_min
E (ExponentialLR): Exponential decay with gamma=0.96

Cyclic / Warm-up Schedulers

OC (OneCycleLR): Super-convergence schedule (per-batch stepping)
CY (CyclicLR): Triangular cyclic schedule (per-batch stepping)
WC (Warmup + Cosine): SequentialLR(LinearLR → CosineAnnealingLR)

Hyperbolic Schedulers (Ours)

H (HyperbolicLR): Hyperbolic decay with epoch-insensitive property
EH (ExpHyperbolicLR): Exponential variant of HyperbolicLR
WH (Warmup + HyperbolicLR): Linear warmup → hyperbolic decay
WEH (Warmup + ExpHyperbolicLR): Linear warmup → exp-hyperbolic decay

LR-Free Optimizers

Prodigy: Prodigy optimizer (self-tuning learning rate)
DAdapt: D-Adaptation Adam (learning-rate-free Adam)

Additional Scripts

`scripts/01_poly_cos_lr.py`

Generates plots for Polynomial and Cosine Annealing LR schedulers at different epoch settings.

`scripts/02_prop_1.py`

Generates a plot demonstrating the properties of the hyperbolic function used in HyperbolicLR.

`scripts/plot_hyperbolic_lr.py`

Creates plots for HyperbolicLR and ExpHyperbolicLR LR curves and initial gradients.

`scripts/03_decay_rate_analysis.py`

Compares instantaneous decay rate d(eta)/dt across all schedulers. Visualizes why hyperbolic decay behaves differently.

`scripts/04_cumulative_lr_budget.py`

Compares cumulative LR budget integral(eta, 0, N) across schedulers for different epoch counts. Demonstrates that HyperbolicLR has similar cumulative budgets across different N.

Additional Notes

The bring.sh script copies the main hyperbolic_lr.py implementation into the experiment directory.
Rust components provide efficient data generation, analysis, and metric calculation.
All experiments are tracked via W&B. Set WANDB_MODE=offline for offline runs.
To add a new scheduler, add it only to scheduler_config.py — all three main_*.py files will pick it up automatically.

For more detailed information on the HyperbolicLR implementation, refer to the main README in the root directory.

Name		Name	Last commit message	Last commit date
parent directory ..
analyze		analyze
figs		figs
integral		integral
learning_curve_example		learning_curve_example
learning_curves		learning_curves
metric		metric
osc		osc
scripts		scripts
.gitignore		.gitignore
HyperbolicLR-OSC-LSTM_Seq2Seq.db		HyperbolicLR-OSC-LSTM_Seq2Seq.db
README.md		README.md
bring.sh		bring.sh
install_requirements.sh		install_requirements.sh
main_cifar10.py		main_cifar10.py
main_integral.py		main_integral.py
main_osc.py		main_osc.py
metric_plot.py		metric_plot.py
model.py		model.py
requirements.txt		requirements.txt
result_plot.py		result_plot.py
run_all.py		run_all.py
scheduler_config.py		scheduler_config.py
util.py		util.py

FilesExpand file tree

paper

Directory actions

More options

Directory actions

More options

Latest commit

History

paper

Folders and files

parent directory

README.md

HyperbolicLR: Epoch-insensitive Learning Rate Scheduler - Experiments

Directory Structure

Main Python Files

scheduler_config.py

main_cifar10.py

main_osc.py

main_integral.py

run_all.py

model.py

util.py

result_plot.py

metric_plot.py

Requirements

Setup

Data Generation

Running Experiments

Run Modes (Interactive Scripts)

All-in-One Workflow (Recommended)

Interactive Workflow

Step 1: Pilot Run (Sanity Check)

Step 2: Optimize Hyperparameters

Step 3: Epoch Sensitivity Comparison

Step 4: Direct Comparison

Full Experiment Matrix

Notes on Specific Schedulers

Analysis

1. Export W&B Data

2. Rust Analysis Tools

3. Python Visualization

Scheduler Overview

Traditional Schedulers (Baselines)

Cyclic / Warm-up Schedulers

Hyperbolic Schedulers (Ours)

LR-Free Optimizers

Additional Scripts

scripts/01_poly_cos_lr.py

scripts/02_prop_1.py

scripts/plot_hyperbolic_lr.py

scripts/03_decay_rate_analysis.py

scripts/04_cumulative_lr_budget.py

Additional Notes

`scheduler_config.py`

`main_cifar10.py`

`main_osc.py`

`main_integral.py`

`run_all.py`

`model.py`

`util.py`

`result_plot.py`

`metric_plot.py`

`scripts/01_poly_cos_lr.py`

`scripts/02_prop_1.py`

`scripts/plot_hyperbolic_lr.py`

`scripts/03_decay_rate_analysis.py`

`scripts/04_cumulative_lr_budget.py`