A curated list of awesome human-human interaction (HHI) resources. If you spot any errors or omissions, please don't hesitate to open an issue.
| Dataset | Year | Motions | Frames | Texts | Scheme | Modality |
|---|---|---|---|---|---|---|
| UMPM | 2011 | 36 | 400K | No | MoCap | Skeleton |
| SBU Kinect | 2012 | 300 | 7.5K | No | RGB+D | Skeleton |
| You2Me | 2020 | 42 | 77K | No | RGB+D | Skeleton |
| NTU RGB+D 120 | 2019 | 8,276 | 462K | No | RGB+D | Skeleton |
| Chi3D | 2020 | 373 | 63K | No | MoCap | SMPL-X |
| ExPI | 2022 | 115 | 30K | No | Multi RGB | Skeleton |
| GTA Combat | 2023 | 6.8K | 2.05M | No | Synthetic | Skeleton |
| Hi4D | 2023 | 100 | 11K | No | Multi RGB | SMPL |
| InterHuman | 2023 | 6,022 | 1.7M | Yes | Multi RGB | SMPL |
| Inter-X | 2024 | 11,388 | 8.1M | Yes | MoCap | SMPL-X, Skeleton |
| ReMoCap | 2024 | 87 | 275.7K | No | Multi RGB | Skeleton |
| InterDance | 2025 | - | 3.93 hours | No | MoCap | SMPL-X |
| Embody 3D | 2025 | - | 500 hours | Yes | Multi RGB | SMPL-X |
- Stability-Driven Motion Generation for Object-Guided Human-Human Co-Manipulation, arXiv'26 [Paper], [Project]
- HINT: Hierarchical Interaction Modeling for Autoregressive Multi-Human Motion Generation, arXiv'26 [Paper]
- Interact2Ar: Full-Body Human-Human Interaction Generation via Autoregressive Diffusion Models, CVPR'26 [Paper], [Project]
- Diffusion Forcing for Multi-Agent Interaction Sequence Modeling, arXiv'25 [Paper], [Code]
- InterMoE: Individual-Specific 3D Human Interaction Generation via Dynamic Temporal-Selective MoE, AAAI'26 [Paper], [Code]
- Disentangled Hierarchical VAE for 3D Human-Human Interaction Generation, ICLR'26 Submission [Paper]
- CODA: Commonsense-Driven Autoregressive Human Interaction Generation, ICLR'26 Submission [Paper]
- Fine-Grained Text-Driven Dual-Human Motion Generation via Dynamic Hierarchical Interaction, arXiv'25 [Paper]
- InterAct: A Large-Scale Dataset of Dynamic, Expressive and Interactive Activities between Two People in Daily Scenarios, SCA'25 [Paper], [Project]
- Ponimator: Unfolding Interactive Pose for Versatile Human-Human Interaction Animation, ICCV'25 [Paper], [Project], [Code]
- Perceiving and Acting in First-Person: A Dataset and Benchmark for Egocentric Human-Object-Human Interactions, ICCV'25 [Paper], [Project]
- Towards Immersive Human-X Interaction: A Real-Time Framework for Physically Plausible Motion Synthesis, ICCV'25 [Paper], [Project]
- MDD: A Dataset for Text-and-Music Conditioned Duet Dance Generation, ICCV'25 [Paper], [Project]
- DuetGen: Music Driven Two-Person Dance Generation via Hierarchical Masked Modeling, SIGGRAPH'25 [Paper], [Project], [Code]
- PhysiInter: Integrating Physical Mapping for High-Fidelity Human Interaction Generation, arXiv'25 [Paper]
- InterMamba: Efficient Human-Human Interaction Generation with Adaptive Spatio-Temporal Mamba, arXiv'25 [Paper]
- Dyadic Mamba: Long-term Dyadic Human Motion Synthesis, CVPRW'25 [Paper]
- InterMask: 3D Human Interaction Generation via Collaborative Masked Modeling, ICLR'25 [Paper], [Project], [Code]
- TIMotion: Temporal and Interactive Framework for Efficient Human-Human Motion Generation, CVPR'25 [Paper], [Project], [Code]
- MixerMDM: Learnable Composition of Human Motion Diffusion Models, CVPR'25 [Paper], [Project], [Code]
- Invisible Strings: Revealing Latent Dancer-to-Dancer Interactions with Graph Neural Networks, ICCC'25 [Paper], [Code]
- Leader and Follower: Interactive Motion Generation under Trajectory Constraints, arXiv'25 [Paper]
- InterDance: Reactive 3D Dance Generation with Realistic Duet Interactions, arXiv'25 [Paper], [Code]
- Two-in-One: Unified Multi-Person Interactive Motion Generation by Latent Diffusion Transformer, arXiv'25 [Paper]
- It Takes Two: Real-time Co-Speech Two-person's Interaction Generation via Reactive Auto-regressive Diffusion Model, arXiv'25 [Paper]
- COLLAGE: Collaborative Human-Agent Interaction Generation using Hierarchical Latent Diffusion and Language Models, arXiv'25 [Paper]
- InterControl: Zero-shot Human Interaction Generation by Controlling Every Joint, NeurIPS'24 [Paper], [Code]
- Towards Open Domain Text-Driven Synthesis of Multi-Person Motions, ECCV'24 [Paper], [Project]
- in2IN: Leveraging individual Information to Generate Human INteractions, CVPRW'24 [Paper], [Project], [Code]
- CORE4D: A 4D Human-Object-Human Interaction Dataset for Collaborative Object REarrangement, arXiv 2024.06 [Paper], [Project], [Code]
- Inter-X: Towards Versatile Human-Human Interaction Analysis, CVPR'24 [Paper], [Project], [Code&Data]
- ContactGen: Contact-Guided Interactive 3D Human Generation for Partners, AAAI'24 [Paper], [Project], [Code]
- InterGen: Diffusion-based Multi-human Motion Generation under Complex Interactions, IJCV'24 [Paper], [Project], [Code&Data]
- ActFormer: A GAN-based Transformer towards General Action-Conditioned 3D Human Motion Generation, ICCV'23 [Paper], [Project], [Code]
- Duolando: Follower GPT with Off-Policy Reinforcement Learning for Dance Accompaniment, ICLR'24 [Paper], [Project], [Code], [Data]
- PriorMDM: Human Motion Diffusion as a Generative Prior, ICLR'24 [Paper], [Project], [Code]
- Neural Animation Layering for Synthesizing Martial Arts Movements, SIGGRAPH'21 [Paper]
- Local Motion Phases for Learning Multi-Contact Character Movements, SIGGRAPH'20 [Paper], [Code]
- ReMoGen: Real-time Human Interaction-to-Reaction Generation via Modular Learning from Diverse Data, CVPR'26 [Paper], [Project]
- EgoReAct: Egocentric Video-Driven 3D Human Reaction Generation, arXiv'25 [Paper], [Project]
- ReactMotion: Generating Reactive Listener Motions from Speaker Utterance, arXiv'26 [Paper], [Project]
- ARMFlow: AutoRegressive MeanFlow for Online 3D Human Reaction Generation, arXiv'25 [Paper]
- ReactionMamba: Generating Short & Long Human Reaction Sequences, arXiv'25 [Paper]
- Uni-Inter: Unifying 3D Human Motion Synthesis Across Diverse Interaction Contexts, arXiv'25 [Paper]
- MoReact: Generating Reactive Motion from Textual Descriptions, arXiv'25 [Paper]
- Reactffusion: Physical Contact-guided Diffusion Model for Reaction Generation, arXiv'25 [Paper]
- Real-time and Controllable Reactive Motion Synthesis via Intention Guidance, arXiv'25 [Paper]
- MARRS: Masked Autoregressive Unit-based Reaction Synthesis, arXiv'25 [Paper]
- E-React: Towards Emotionally Controlled Synthesis of Human Reactions, arXiv'25 [Paper], [Project]
- ReactDance: Progressive-Granular Representation for Long-Term Coherent Reactive Dance Generation, arXiv'25 [Paper]
- HERO: Human Reaction Generation from Videos, arXiv'25 [Paper], [Project]
- Think Then React: Towards Unconstrained Action-to-Reaction Motion Generation, ICLR'25 [Paper], [Project]
- Ready-to-React: Online Reaction Policy for Two-Character Interaction Generation, ICLR'25 [Paper], [Project], [Code]
- Interactive Humanoid: Online Full-Body Motion Reaction Synthesis with Social Affordance Canonicalization and Forecasting, 3DV'25 [Paper], [Project]
- SocialGen: Modeling Multi-Human Social Interaction with Language Models, arXiv'25 [Paper], [Project]
- ARFlow: Human Action-Reaction Flow Matching with Physical Guidance, arXiv'25 [Paper], [Project]
- PhysReaction: Physically Plausible Real-Time Humanoid Reaction Synthesis via Forward Dynamics Guided 4D Imitation, arXiv 2024.04 [Paper], [Project]
- Inter-X: Towards Versatile Human-Human Interaction Analysis, CVPR'24 [Paper], [Project], [Code&Data]
- ReGenNet: Towards Human Action-Reaction Synthesis, CVPR'24 [Paper], [Project], [Code]
- Role-aware Interaction Generation from Textual Description, ICCV'23 [Paper], [Code]
- ReMoS: 3D Motion-Conditioned Reaction Synthesis for Two-Person Interactions, ECCV'24 [Paper], [Project]
- Interaction Transformer for Human Reaction Generation, TMM'23 [Paper]
- MAMMA: Markerless & Automatic Multi-Person Motion Action Capture, CVPR'26 [Paper]
- Reconstructing Close Human Interaction with Appearance and Proxemics Reasoning, CVPR'25 [Paper]
- Harmony4D: A Video Dataset for In-The-Wild Close Human Interactions, NeurIPS'24 [Paper], [Project], [Data]
- AvatarPose: Avatar-guided 3D Pose Estimation of Close Human Interaction from Sparse Multi-view Videos, ECCV'24 [Paper], [Project]
- Capturing Closely Interacted Two-Person Motions with Reaction Priors, CVPR'24 [Paper], [Project]
- Pose Priors from Language Models, arXiv 2024.05 [Paper]
- MultiPhys: Multi-Person Physics-aware 3D Motion Estimation, CVPR'24 [Paper], [Project], [Code]
- Closely Interactive Human Reconstruction with Proxemics and Physics-Guided Adaption, CVPR'24 [Paper], [Code]
- Reconstructing Close Human Interactions from Multiple Views, SIGGRAPH Asia'23 [Paper], [Code]
- Hi4D: 4D Instance Segmentation of Close Human Interaction, CVPR'23 [Paper], [Project], [Code&Data]
- Multi-Person Extreme Motion Prediction, CVPR'22 [Paper], [Project], [Code]
- Multiple Human Motion Understanding, AAAI'26 [Paper]
- SportsHHI: A Dataset for Human-Human Interaction Detection in Sports Videos, CVPR'24 [Paper], [Code]
- Inter-X: Towards Versatile Human-Human Interaction Analysis, CVPR'24 [Paper], [Project], [Code&Data]
- IGFormer: Interaction Graph Transformer for Skeleton-based Human Interaction Recognition, ECCV'22 [Paper]
- Human-to-Human Interaction Detection, arXiv 2023.07 [Paper]
- InterAct: A Large-Scale Dataset of Dynamic, Expressive and Interactive Activities between Two People in Daily Scenarios, SCA'25 [Paper], [Project]
- HHI-Assist: A Dataset and Benchmark of Human-Human Interaction in Physical Assistance Scenario, RA-L'25 [Paper], [Project], [Code]
- InterDance: Reactive 3D Dance Generation with Realistic Duet Interactions, arXiv'25 [Paper], [Code]
- Inter-X: Towards Versatile Human-Human Interaction Analysis, CVPR'24 [Paper], [Project], [Code&Data]
- SportsHHI: A Dataset for Human-Human Interaction Detection in Sports Videos, CVPR'24 [Paper], [Code]
- ReMoS: 3D Motion-Conditioned Reaction Synthesis for Two-Person Interactions, ECCV'24 [Paper], [Project]
- InterGen: Diffusion-based Multi-human Motion Generation under Complex Interactions, IJCV'24 [Paper], [Project], [Code&Data]
- Hi4D: 4D Instance Segmentation of Close Human Interaction, CVPR'23 [Paper], [Project], [Code&Data]
- ActFormer: A GAN-based Transformer towards General Action-Conditioned 3D Human Motion Generation, ICCV'23 [Paper], [Project], [Code]
- Multi-Person Extreme Motion Prediction, CVPR'22 [Paper], [Project], [Code]
- Three-dimensional Reconstruction of Human Interactions, CVPR'20 [Paper], [Project]
- Unified Number-Free Text-to-Motion Generation Via Flow Matching, CVPR'26 [Paper], [Project]
- Large-Scale Multi-Character Interaction Synthesis, SIGGRAPH'25 [Paper], [Project]
- PINO: Person-Interaction Noise Optimization for Long-Duration and Customizable Motion Generation of Arbitrary-Sized Groups, ICCV'25 [Paper], [Project], [Code]
- Multi-Person Interaction Generation from Two-Person Motion Priors, SIGGRAPH'25 [Paper], [Project], [Code]
- SocialGen: Modeling Multi-Human Social Interaction with Language Models, arXiv'25 [Paper], [Project]
- Towards Open Domain Text-Driven Synthesis of Multi-Person Motions, ECCV'24 [Paper], [Project]
- Stochastic Multi-Person 3D Motion Forecasting, ICLR'23 [Paper], [Project], [Code]
- Joint-Relation Transformer for Multi-Person Motion Prediction, ICCV'23 [Paper], [Code]
- Learning Whole-Body Human-Humanoid Interaction from Human-Human Demonstrations, arXiv'25 [Paper]
- SymBridge: A Human-in-the-Loop Cyber-Physical Interactive System for Adaptive Human-Robot Symbiosis, arXiv'25 [Paper]
- It Takes Two: Learning Interactive Whole-Body Control Between Humanoid Robots, arXiv'25 [Paper], [Code]