Merged
53 commits

- 3678c57 added rtmpose head (n-poulsen, Oct 22, 2024)
- 094ba6f added CSPNext backbone (n-poulsen, Oct 22, 2024)
- ebebb3b add code to load pretrained CSPNext weights (n-poulsen, Oct 22, 2024)
- ed0cbee update number of backbone channels (n-poulsen, Oct 22, 2024)
- 6a66683 update weight init (n-poulsen, Oct 22, 2024)
- 34f1baf added possibility for SequentialLR, tests, docs, default for RTMPose (n-poulsen, Oct 24, 2024)
- f737e53 fix scheduler/optimizer config update (n-poulsen, Oct 25, 2024)
- 62e288f fix tests (n-poulsen, Oct 28, 2024)
- 0b49323 made top down crop work on non-square images, in numpy to make faster :) (n-poulsen, Oct 30, 2024)
- 6ed1ab3 added example to run top-down video analysis without the detector (n-poulsen, Oct 30, 2024)
- 8aab808 improved inits (n-poulsen, Oct 30, 2024)
- f13f88e edit rtmpose_x for default square bbox (n-poulsen, Oct 30, 2024)
- 5add563 add random bbox transform; fix top_down_crop (n-poulsen, Oct 31, 2024)
- a8470d2 add missing transform method (n-poulsen, Oct 31, 2024)
- 27679eb fix transform - label must be int (n-poulsen, Oct 31, 2024)
- 5ae8bea only apply top-down crop on non-empty image (n-poulsen, Oct 31, 2024)
- 57da127 bug fix: collating scales (n-poulsen, Oct 31, 2024)
- 49c6a4a set image dtype when no bounding box (n-poulsen, Oct 31, 2024)
- eebbfe5 fix simcc keypoint weights when -1 or 2 (n-poulsen, Oct 31, 2024)
- 083dcba update scheduler settings for rtmpose (n-poulsen, Oct 31, 2024)
- 5d2dc6c fix rtmpose-m config (n-poulsen, Oct 31, 2024)
- 91e40eb bug fix: bbox from keypoints (n-poulsen, Nov 1, 2024)
- 1697e12 update lr and switch to linear warmup (n-poulsen, Nov 1, 2024)
- 8e6ad27 bug fix - input size (n-poulsen, Nov 1, 2024)
- 3635698 added documentation (n-poulsen, Nov 1, 2024)
- 76b1414 added code to download backbone weights from huggingface (n-poulsen, Nov 7, 2024)
- 1bd0564 bug fix; added s config (n-poulsen, Nov 8, 2024)
- 5f9dd9c fix LR to 5e-05 after cosine annealing LR (n-poulsen, Nov 8, 2024)
- 26d9227 init learning rate to 5e-3 (n-poulsen, Nov 8, 2024)
- fe88d4f Merge branch 'pytorch_dlc' into niels/rtmpose (n-poulsen, Nov 8, 2024)
- be15e9e update default LR (n-poulsen, Nov 8, 2024)
- d8d8581 Merged pytorch_dlc changes (n-poulsen, Nov 8, 2024)
- 4528fab Merge branch 'pytorch_dlc' into niels/rtmpose (n-poulsen, Nov 11, 2024)
- 3ee1f4c Merge branch 'pytorch_dlc' into niels/rtmpose (n-poulsen, Nov 22, 2024)
- f336d2d Merge branch 'pytorch_dlc' into niels/rtmpose (n-poulsen, Nov 25, 2024)
- 97970fe fix rtmpose_s config (n-poulsen, Nov 25, 2024)
- 694b467 bug fix - target generation (n-poulsen, Nov 25, 2024)
- b18c69b bug fix: CSPNeXt config (n-poulsen, Nov 28, 2024)
- e3e7513 update user guide (n-poulsen, Nov 28, 2024)
- 445ad4a update user guide (n-poulsen, Nov 28, 2024)
- 6938ee4 Merge branch 'pytorch_dlc' into niels/rtmpose (n-poulsen, Dec 3, 2024)
- 2c3e6bd Merge branch 'pytorch_dlc' into niels/rtmpose (n-poulsen, Dec 4, 2024)
- 9d9cd34 fix NaNs propagating to non-visible keypoints (n-poulsen, Dec 4, 2024)
- 9d07696 update default configs (n-poulsen, Dec 4, 2024)
- e4d8699 address github comments (n-poulsen, Dec 10, 2024)
- 7158a52 Merge branch 'pytorch_dlc' into niels/rtmpose (n-poulsen, Dec 10, 2024)
- ba536a2 fix import for top_down_crop (n-poulsen, Dec 12, 2024)
- 4328068 Merge branch 'pytorch_dlc' into niels/rtmpose (n-poulsen, Dec 13, 2024)
- 9e075e9 crop sampling after affine aug (n-poulsen, Dec 13, 2024)
- d187038 RTMPose helper method (n-poulsen, Dec 13, 2024)
- ed403eb bug fix: pose inference runner (n-poulsen, Dec 13, 2024)
- 7f6b092 Merge branch 'pytorch_dlc' into niels/rtmpose (MMathisLab, Dec 14, 2024)
- 279adba Merge branch 'pytorch_dlc' into niels/rtmpose (n-poulsen, Dec 19, 2024)
3 changes: 3 additions & 0 deletions .gitignore
@@ -21,6 +21,9 @@ snapshot-*
# Modelzoo checkpoints
deeplabcut/modelzoo/checkpoints/

# PyTorch backbone weights
deeplabcut/pose_estimation_pytorch/models/backbones/pretrained_weights/

# Wandb files
wandb/

1 change: 1 addition & 0 deletions deeplabcut/__init__.py
@@ -30,6 +30,7 @@
"DLC loaded in light mode; you cannot use any GUI (labeling, relabeling and standalone GUI)"
)

from deeplabcut.core.engine import Engine
from deeplabcut.create_project import (
create_new_project,
create_new_project_3d,
74 changes: 62 additions & 12 deletions deeplabcut/pose_estimation_pytorch/README.md
@@ -420,16 +420,20 @@ train(

### Running Video Analysis outside a DeepLabCut Project

DeepLabCut provides high-level APIs (via the GUI or the python package) to analyze your
data. The usage of this API assumes the existence of a DLC project (with a `config.yaml`
file, etc.).

Sometimes it is more convenient to run a model on your data directly via a low-level
API. We also use this API under the hood, in particular for the Model Zoo. Check out the
example below:

```python
from pathlib import Path

from deeplabcut.pose_estimation_pytorch import Task
from deeplabcut.pose_estimation_pytorch.apis.analyze_videos import video_inference
from deeplabcut.pose_estimation_pytorch.config import read_config_as_dict
from deeplabcut.pose_estimation_pytorch.task import Task
from deeplabcut.pose_estimation_pytorch.apis.utils import get_inference_runners

train_dir = Path("/Users/Jaylen/my-dlc-models/train")
@@ -447,30 +447,76 @@ detector_batch_size = 8

# read model configuration
model_cfg = read_config_as_dict(pytorch_config_path)
bodyparts = model_cfg["metadata"]["bodyparts"]
unique_bodyparts = model_cfg["metadata"]["unique_bodyparts"]
with_identity = model_cfg["metadata"].get("with_identity", False)

pose_task = Task(model_cfg["method"])
pose_runner, detector_runner = get_inference_runners(
model_config=model_cfg,
snapshot_path=snapshot_path,
max_individuals=max_num_animals,
num_bodyparts=len(bodyparts),
num_unique_bodyparts=len(unique_bodyparts),
batch_size=batch_size,
with_identity=with_identity,
transform=None,
detector_batch_size=detector_batch_size,
detector_path=detector_snapshot_path,
detector_transform=None,
)

predictions = video_inference(
video=video_path,
task=pose_task,
pose_runner=pose_runner,
detector_runner=detector_runner,
with_identity=False,
)
```


### Running Top-Down Video Analysis with Existing Bounding Boxes

When `deeplabcut.pose_estimation_pytorch.apis.analyze_videos.video_inference` is called
with a top-down model, it is assumed that a detector snapshot is given as well to obtain
bounding boxes with which to run pose estimation. It's possible that you've already
obtained bounding boxes for your video (with another object detector or through some
other means), and you want to re-use those bounding boxes instead of running an object
detector again.

You can easily do so by writing a bit of custom code, as shown in the example below:

```python
from pathlib import Path

import numpy as np
from deeplabcut.pose_estimation_pytorch import get_inference_runners
from deeplabcut.pose_estimation_pytorch.apis import VideoIterator
from deeplabcut.pose_estimation_pytorch.config import read_config_as_dict
from tqdm import tqdm

# create an iterator for your video
video = VideoIterator("/Users/Jayson/my-cool-video.mp4")

# dummy bboxes - you can load yours from a file or in another way
# the bboxes should be in `xywh` format, i.e. (x_top_left, y_top_left, width, height)
bounding_boxes = [
dict( # frame 0 bounding boxes
bboxes=np.array([[12, 37, 120, 78]]),
),
dict( # frame 1 bounding boxes
bboxes=np.array([[17, 45, 128, 73], [532, 34, 117, 87]]),
),
# ...
dict( # frame N bboxes - the list must contain one entry per frame in the video!
bboxes=np.array([[17, 45, 128, 73], [532, 34, 117, 87]]),
),
]
video.set_context(bounding_boxes)
max_individuals = np.max([len(context["bboxes"]) for context in bounding_boxes])

# run inference!
model_cfg = read_config_as_dict("/Users/Jayson/pytorch_config.yaml")
pose_runner, _ = get_inference_runners(
model_config=model_cfg,
snapshot_path=Path("/Users/Jayson/model-snapshot.pt"),
max_individuals=max_individuals,
batch_size=32,
)

# your predictions will be a list, containing the predictions made for each frame
# as a dict (with keys for "bodyparts" but also "bboxes")!
predictions = pose_runner.inference(images=tqdm(video))
```
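Note that many detectors return boxes in corner `(x1, y1, x2, y2)` format rather than the `xywh` layout expected above. A small conversion helper (a sketch, not part of the DeepLabCut API) bridges the two:

```python
import numpy as np

def xyxy_to_xywh(bboxes: np.ndarray) -> np.ndarray:
    """Convert (x1, y1, x2, y2) corner boxes to (x, y, width, height) boxes."""
    bboxes = np.asarray(bboxes, dtype=float)
    converted = bboxes.copy()
    converted[:, 2] = bboxes[:, 2] - bboxes[:, 0]  # width = x2 - x1
    converted[:, 3] = bboxes[:, 3] - bboxes[:, 1]  # height = y2 - y1
    return converted

# the first frame's box from the example above, as corners: becomes 120x78 at (12, 37)
print(xyxy_to_xywh(np.array([[12, 37, 132, 115]])))
```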
1 change: 1 addition & 0 deletions deeplabcut/pose_estimation_pytorch/__init__.py
@@ -11,6 +11,7 @@
from deeplabcut.pose_estimation_pytorch.apis import (
analyze_videos,
convert_detections2tracklets,
get_inference_runners,
evaluate_network,
extract_maps,
extract_save_all_maps,
22 changes: 19 additions & 3 deletions deeplabcut/pose_estimation_pytorch/apis/__init__.py
@@ -10,16 +10,32 @@
#

from deeplabcut.pose_estimation_pytorch.apis.analyze_images import (
analyze_image_folder,
analyze_images,
superanimal_analyze_images,
)
from deeplabcut.pose_estimation_pytorch.apis.analyze_videos import analyze_videos
from deeplabcut.pose_estimation_pytorch.apis.analyze_videos import (
analyze_videos,
video_inference,
VideoIterator,
)
from deeplabcut.pose_estimation_pytorch.apis.convert_detections_to_tracklets import (
convert_detections2tracklets,
)
from deeplabcut.pose_estimation_pytorch.apis.evaluate import evaluate_network
from deeplabcut.pose_estimation_pytorch.apis.evaluate import (
evaluate,
evaluate_network,
)
from deeplabcut.pose_estimation_pytorch.apis.export import export_model
from deeplabcut.pose_estimation_pytorch.apis.train import train_network
from deeplabcut.pose_estimation_pytorch.apis.train import (
train,
train_network,
)
from deeplabcut.pose_estimation_pytorch.apis.utils import (
get_detector_inference_runner,
get_inference_runners,
get_pose_inference_runner,
)
from deeplabcut.pose_estimation_pytorch.apis.visualization import (
extract_maps,
extract_save_all_maps,
36 changes: 28 additions & 8 deletions deeplabcut/pose_estimation_pytorch/apis/utils.py
@@ -423,9 +423,9 @@ def build_bboxes_dict_for_dataframe(
def get_inference_runners(
model_config: dict,
snapshot_path: str | Path,
max_individuals: int,
num_bodyparts: int,
num_unique_bodyparts: int,
max_individuals: int | None = None,
num_bodyparts: int | None = None,
num_unique_bodyparts: int | None = None,
batch_size: int = 1,
device: str | None = None,
with_identity: bool = False,
@@ -439,9 +439,12 @@
Args:
model_config: the pytorch configuration file
snapshot_path: the path of the snapshot from which to load the weights
max_individuals: the maximum number of individuals per image
num_bodyparts: the number of bodyparts predicted by the model
num_unique_bodyparts: the number of unique_bodyparts predicted by the model
max_individuals: the maximum number of individuals per image (if None, uses the
individuals defined in the model_config metadata)
num_bodyparts: the number of bodyparts predicted by the model (if None, uses the
bodyparts defined in the model_config metadata)
num_unique_bodyparts: the number of unique_bodyparts predicted by the model (if
None, uses the unique bodyparts defined in the model_config metadata)
batch_size: the batch size to use for the pose model.
with_identity: whether the pose model has an identity head
device: if defined, overwrites the device selection from the model config
Expand All @@ -457,6 +460,13 @@ def get_inference_runners(
a runner for pose estimation
a runner for detection, if detector_path is not None
"""
if max_individuals is None:
max_individuals = len(model_config["metadata"]["individuals"])
if num_bodyparts is None:
num_bodyparts = len(model_config["metadata"]["bodyparts"])
if num_unique_bodyparts is None:
num_unique_bodyparts = len(model_config["metadata"]["unique_bodyparts"])

pose_task = Task(model_config["method"])
if device is None:
device = resolve_device(model_config)
@@ -482,10 +492,15 @@
if device == "mps":
detector_device = "cpu"

crop_cfg = model_config["data"]["inference"].get("top_down_crop", {})
width, height = crop_cfg.get("width", 256), crop_cfg.get("height", 256)
margin = crop_cfg.get("margin", 0)

pose_preprocessor = build_top_down_preprocessor(
color_mode=model_config["data"]["colormode"],
transform=transform,
cropped_image_size=(256, 256),
top_down_crop_size=(width, height),
top_down_crop_margin=margin,
)
pose_postprocessor = build_top_down_postprocessor(
max_individuals=max_individuals,
@@ -636,10 +651,15 @@ def get_pose_inference_runner(
with_identity=with_identity,
)
else:
crop_cfg = model_config["data"]["inference"].get("top_down_crop", {})
width, height = crop_cfg.get("width", 256), crop_cfg.get("height", 256)
margin = crop_cfg.get("margin", 0)

pose_preprocessor = build_top_down_preprocessor(
color_mode=model_config["data"]["colormode"],
transform=transform,
cropped_image_size=(256, 256),
top_down_crop_size=(width, height),
top_down_crop_margin=margin,
)
pose_postprocessor = build_top_down_postprocessor(
max_individuals=max_individuals,
19 changes: 19 additions & 0 deletions deeplabcut/pose_estimation_pytorch/config/backbones/cspnext_m.yaml
@@ -0,0 +1,19 @@
model:
backbone:
type: CSPNeXt
model_name: cspnext_m
freeze_bn_stats: false
freeze_bn_weights: false
deepen_factor: 0.67
widen_factor: 0.75
backbone_output_channels: 768
runner:
optimizer:
type: AdamW
params:
lr: 0.0005
scheduler:
type: LRListScheduler
params:
lr_list: [ [ 1e-4 ], [ 1e-5 ] ]
milestones: [ 90, 190 ]
19 changes: 19 additions & 0 deletions deeplabcut/pose_estimation_pytorch/config/backbones/cspnext_s.yaml
@@ -0,0 +1,19 @@
model:
backbone:
type: CSPNeXt
model_name: cspnext_s
freeze_bn_stats: false
freeze_bn_weights: false
deepen_factor: 0.33
widen_factor: 0.5
backbone_output_channels: 512
runner:
optimizer:
type: AdamW
params:
lr: 0.0005
scheduler:
type: LRListScheduler
params:
lr_list: [ [ 1e-4 ], [ 1e-5 ] ]
milestones: [ 90, 190 ]
19 changes: 19 additions & 0 deletions deeplabcut/pose_estimation_pytorch/config/backbones/cspnext_x.yaml
@@ -0,0 +1,19 @@
model:
backbone:
type: CSPNeXt
model_name: cspnext_x
freeze_bn_stats: false
freeze_bn_weights: false
deepen_factor: 1.33
widen_factor: 1.25
backbone_output_channels: 1280
runner:
optimizer:
type: AdamW
params:
lr: 0.0005
scheduler:
type: LRListScheduler
params:
lr_list: [ [ 1e-4 ], [ 1e-5 ] ]
milestones: [ 90, 190 ]
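In these configs, `LRListScheduler` appears to set the learning rate to the corresponding `lr_list` entry once each milestone epoch is reached. That reading of its semantics is an assumption; the helper below is illustrative only, not the DeepLabCut implementation:

```python
def lr_at_epoch(
    epoch: int,
    base_lr: float,
    lr_list: list[list[float]],
    milestones: list[int],
) -> float:
    """Piecewise-constant schedule: base_lr until the first milestone, then the
    lr_list entry of the most recently passed milestone."""
    lr = base_lr
    for milestone, values in zip(milestones, lr_list):
        if epoch >= milestone:
            lr = values[0]  # one value per parameter group; single group here
    return lr

# With the config above: 5e-4 for epochs 0-89, 1e-4 for 90-189, 1e-5 afterwards
schedule = [lr_at_epoch(e, 5e-4, [[1e-4], [1e-5]], [90, 190]) for e in (0, 90, 200)]
print(schedule)  # -> [0.0005, 0.0001, 1e-05]
```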
@@ -1,6 +1,9 @@
colormode: RGB
inference:
normalize_images: true
top_down_crop:
width: 256
height: 256
train:
affine:
p: 0.5
@@ -13,3 +16,6 @@ train:
hist_eq: false
motion_blur: false
normalize_images: true
top_down_crop:
width: 256
height: 256
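These `top_down_crop` entries are the values read by the runner-building code in the `utils.py` diff above. A minimal sketch of that parsing, with the same 256x256, zero-margin fallbacks (the helper name is hypothetical):

```python
def read_crop_settings(model_cfg: dict) -> tuple[tuple[int, int], int]:
    """Return ((width, height), margin) for the top-down crop, defaulting to
    a 256x256 crop with no margin when the config omits the section."""
    crop_cfg = model_cfg["data"]["inference"].get("top_down_crop", {})
    size = (crop_cfg.get("width", 256), crop_cfg.get("height", 256))
    return size, crop_cfg.get("margin", 0)

cfg = {"data": {"inference": {"top_down_crop": {"width": 192, "height": 256}}}}
print(read_crop_settings(cfg))  # -> ((192, 256), 0)
print(read_crop_settings({"data": {"inference": {}}}))  # -> ((256, 256), 0)
```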
@@ -40,6 +40,6 @@ runner:
train_settings:
batch_size: 1
dataloader_workers: 0
dataloader_pin_memory: true
dataloader_pin_memory: false
display_iters: 500
epochs: 250
@@ -39,7 +39,7 @@ model:
heatmap_config:
channels:
- 270
- 64
- 18
- "num_bodyparts + 1" # num_bodyparts + center keypoint
num_blocks: 1
dilation_rate: 1
@@ -39,7 +39,7 @@ model:
heatmap_config:
channels:
- 480
- 64
- 32
- "num_bodyparts + 1" # num_bodyparts + center keypoint
num_blocks: 1
dilation_rate: 1
@@ -39,7 +39,7 @@ model:
heatmap_config:
channels:
- 720
- 64 # TODO: Check channels
- 48
- "num_bodyparts + 1" # num_bodyparts + center keypoint
num_blocks: 1
dilation_rate: 1