fovi

Welcome to the fovi codebase, a PyTorch library for implementing foveated vision. This library provides tools for foveated sampling and an interface to deep vision models, including CNNs and ViTs.

🛠️ Install

First, create a fresh conda environment:

conda create -n fovi python=3.9 # 3.9 is only necessary if using ffcv, see below
conda activate fovi

Clone the repo and enter it:

git clone https://github.com/nblauch/fovi.git
cd fovi

Now, for installing our package. The easiest installation is without ffcv, as ffcv rquires Python 3.9 and other harder dependencies. Installing without it will allow you to use everything in our code-base except the training functionality that leverages ffcv. If you want training functionality with ffcv, see below. You could also use your own training scripts with our models.

For the easy install, with your new environment activated, just do:

# from within the fovi repo
pip install -e . # this will automatically install fovi/requirements.txt

To install with ffcv to allow fast training, we first follow the instructions to install ffcv-ssl, which has stricter requirements, and then install fovi and its requirements. With your fovi conda environment activated, do:

conda install cupy pkg-config compilers libjpeg-turbo opencv pytorch torchvision torchaudio pytorch-cuda numba -c pytorch -c nvidia -c conda-forge
pip install git+https://github.com/facebookresearch/FFCV-SSL.git
# from within the fovi repo
pip install -e .

To use flash attention, install per the typical approach:

pip install packaging ninja
pip install flash-attn --no-build-isolation

🤗 Pretrained Models

Pretrained models are hosted on HuggingFace Hub and are automatically downloaded on first use:

Model

Size

Description

``fovi -dinov3-hplus_a- 2.78_res-64_in1k ` <https://hugg ingface.co/fovi- pytorch/fovi-din ov3-hplus_a-2.78 _res-64_in1k>`__

~3.4 GB

ViT-H/16+ backbone, high foveation (a=2.78)

``fovi -dinov3-splus_a- 2.78_res-64_in1k ` <https://hugg ingface.co/fovi- pytorch/fovi-din ov3-splus_a-2.78 _res-64_in1k>`__

~131 MB

ViT-S/16+ backbone, high foveation (a=2.78)

``fovi-d inov3-splus_a-60 .94_res-64_in1k ` <https://huggi ngface.co/fovi-p ytorch/fovi-dino v3-splus_a-60.94 _res-64_in1k>`__

~131 MB

ViT-S/16+ backbone, low foveation (a=60.94)

``fovi-a lexnet_a-1_res-6 4_rfmult-1_in1k ` <https://huggi ngface.co/fovi-p ytorch/fovi-alex net_a-1_res-64_r fmult-1_in1k>`__

~24 MB

AlexNet, high foveation (a=1), rfmult=1 (matched resolution kernel reference frame)

``fovi-a lexnet_a-1_res-6 4_rfmult-2_in1k ` <https://huggi ngface.co/fovi-p ytorch/fovi-alex net_a-1_res-64_r fmult-2_in1k>`__

~69 MB

AlexNet, high foveation (a=1), rfmult=2 (default higher-resolution kernel reference frame)

from fovi import get_model_from_base_fn

# Models are automatically downloaded from HuggingFace Hub on first use
model = get_model_from_base_fn('fovi-dinov3-splus_a-2.78_res-64_in1k')

📝 Example notebooks

notebooks/step0_sensor_manifold : explore the basic concepts involved in our foveated sensor

notebooks/step1_sampling.ipynb : learn how to do foveated sampling from images

notebooks/step2_knnconv.ipynb : learn how to build kNN-convolutional neural networks to process foveated sensor outputs

notebooks/step3_dinov3.ipynb : work with a state-of-the-art foveated vision system based on the DINOv3 ViT model, adapted to handle foveated inputs.

notebooks/step4_get_activations.ipynb: use hooks to extract intermediate activations from a model, and explore the Trainer class

📚 Documentation

The docs are hosted at: https://nblauch.github.io/fovi/index.html

You can also build locally. Docs are generated semi-automatically from source code and docstrings. The documentation includes:

  • API Reference: Complete documentation of all functions, classes, and modules

  • User Guide: Installation, quickstart, and usage examples

  • Developer Guide: Contributing guidelines and development setup

To do so:

# Install documentation dependencies
pip install -r requirements-docs.txt

# Generate documentation
python scripts/generate_docs.py

# View the documentation
open docs/_build/html/index.html

# View documentation on a remote cluster (need to forward the port separately, this is done automatically in VScode/Cursor)
python -m http.server 8000 --directory docs/_build/html

🏛️ Citation

Blauch, N. M., Alvarez, G. A., & Konkle, T. (2026). FOVI: A biologically-inspired foveated interface for deep vision models. https://arxiv.org/abs/2602.03766