Riemannian Motion Generation

A Unified Framework for Motion Representation and Generation via Riemannian Flow Matching

Fangran Miao1, Jian Huang1, Ting Li2
1PolyU, 2SUSTech

Code and model weights are coming soon!

Left: Illustration of the unified Riemannian representation for articulated motion. Each motion frame can be factorized into global translation \((\mathcal{M}_{\mathcal{T}})\), global orientation and per-joint rotations \((\mathcal{M}_{\mathcal{R}})\), and local pose \((\mathcal{M}_{\mathcal{P}})\) along with the temporal differences \((T\mathcal{M}_{\mathcal{F}}\ \text{for}\ \mathcal{F}\in\{\mathcal{T},\mathcal{R},\mathcal{P}\})\).

Right: Illustration of the Riemannian flow matching process on the RMG manifold. Here \(\mathcal{M}\) is our proposed manifold \(\mathcal{M}_{\mathrm{RMG}}=\mathbb{R}^3\times (\mathbb{S}^3)^J\). The red line is the geodesic between \(\mathbf{x}_0\) and \(\mathbf{x}_1\), and the yellow arrow is the velocity at \(\mathbf{x}_t\).
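As a rough illustration of the geodesic and velocity in the caption above, the following NumPy sketch evaluates the point \(\mathbf{x}_t\) and its velocity on \(\mathbb{R}^3\times (\mathbb{S}^3)^J\). The function names `sphere_geodesic` and `rmg_geodesic` are hypothetical and not taken from the authors' code, and the sketch assumes rotations are stored as unit 4-vectors and ignores any handling of antipodal quaternions.

```python
import numpy as np

def sphere_geodesic(q0, q1, t, eps=1e-8):
    """Point and velocity at time t on the great-circle arc from q0 to q1.

    q0, q1: arrays of shape (..., 4) holding unit vectors on S^3.
    """
    cos_theta = np.clip(np.sum(q0 * q1, axis=-1, keepdims=True), -1.0, 1.0)
    theta = np.arccos(cos_theta)                 # geodesic distance
    u = q1 - cos_theta * q0                      # part of q1 orthogonal to q0
    d = u / np.maximum(np.linalg.norm(u, axis=-1, keepdims=True), eps)
    q_t = np.cos(t * theta) * q0 + np.sin(t * theta) * d
    # Time derivative of q_t; it is orthogonal to q_t, i.e. a tangent vector.
    v_t = theta * (-np.sin(t * theta) * q0 + np.cos(t * theta) * d)
    return q_t, v_t

def rmg_geodesic(x0, x1, t):
    """Geodesic on the product manifold R^3 x (S^3)^J (illustrative only).

    Each state is a tuple (trans, rot) with trans of shape (3,)
    and rot of shape (J, 4).
    """
    trans0, rot0 = x0
    trans1, rot1 = x1
    trans_t = (1.0 - t) * trans0 + t * trans1    # straight line in R^3
    trans_v = trans1 - trans0                    # constant Euclidean velocity
    rot_t, rot_v = sphere_geodesic(rot0, rot1, t)
    return (trans_t, rot_t), (trans_v, rot_v)
```

Because the manifold is a product, the geodesic is computed factor by factor: a straight line for the translation and a great-circle arc for each joint rotation.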

Abstract

Human motion generation is often learned in Euclidean spaces, although valid motions follow structured non-Euclidean geometry. We present Riemannian Motion Generation (RMG), a unified framework that represents motion on a product manifold and learns dynamics via Riemannian flow matching. RMG factorizes motion into several manifold factors, yielding a scale-free representation with intrinsic normalization, and uses geodesic interpolation, tangent-space supervision, and manifold-preserving ODE integration for training and sampling. On HumanML3D, RMG achieves state-of-the-art FID in the HumanML3D format (0.043) and ranks first on all reported metrics under the MotionStreamer format. On MotionMillion, it also surpasses strong baselines (FID 5.6, R@1 0.86). Ablations show that the compact \(\mathcal{T}+\mathcal{R}\) (translation + rotations) representation is the most stable and effective, highlighting geometry-aware modeling as a practical and scalable route to high-fidelity motion generation.
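The abstract does not say which integrator is used for sampling. As one possible reading of "manifold-preserving ODE integration", the sketch below takes Euler steps through the exponential map, so every iterate stays on \(\mathbb{R}^3\times (\mathbb{S}^3)^J\). All names here (`project_to_tangent`, `sphere_exp`, `integrate`, `velocity_fn`) are hypothetical, and `velocity_fn` is only a stand-in for a learned velocity model.

```python
import numpy as np

def project_to_tangent(q, v):
    """Remove the component of v along q so that v lies in the tangent space at q."""
    return v - np.sum(q * v, axis=-1, keepdims=True) * q

def sphere_exp(q, v, eps=1e-8):
    """Exponential map on the unit sphere: move from q along tangent vector v."""
    norm_v = np.linalg.norm(v, axis=-1, keepdims=True)
    direction = v / np.maximum(norm_v, eps)
    return np.cos(norm_v) * q + np.sin(norm_v) * direction

def integrate(velocity_fn, trans, rot, num_steps=50):
    """Geodesic Euler integration from t=0 to t=1 (assumed scheme, not the paper's).

    velocity_fn(trans, rot, t) -> (v_trans, v_rot) is a hypothetical callable;
    trans has shape (3,), rot has shape (J, 4) with unit-norm rows.
    """
    h = 1.0 / num_steps
    for k in range(num_steps):
        t = k * h
        v_trans, v_rot = velocity_fn(trans, rot, t)
        trans = trans + h * v_trans                       # plain Euler step in R^3
        rot = sphere_exp(rot, h * project_to_tangent(rot, v_rot))
        rot = rot / np.linalg.norm(rot, axis=-1, keepdims=True)  # guard against drift
    return trans, rot
```

Projecting the predicted velocity onto the tangent space before the exponential-map step is what keeps each rotation on the unit sphere; a plain Euclidean Euler step would leave it and require renormalization after the fact.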

Motion Representation Comparison

Showcase Videos

Same Prompts, Diverse Motions

A person stands on one leg in a yoga pose

A man performs a standing back kick

The person does a salsa dance

Results

Text-based Motion Generation on HumanML3D

Text-based Motion Generation on MotionMillion

BibTeX

@misc{rmg,
      title={Riemannian Motion Generation: A Unified Framework for Human Motion Representation and Generation via Riemannian Flow Matching}, 
      author={Fangran Miao and Jian Huang and Ting Li},
      year={2026},
      eprint={2603.15016},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2603.15016}, 
}