Human4d

Tasks

Most existing architectures and algorithms were developed for 2D images, and adapting them to 3D data (point clouds or meshes), where no regular structure is directly available, is less obvious. While deep learning CNNs have been used in some 3D contexts, e.g. face modeling or shape classification, their use for live modeling of complex, articulated shapes such as the human body has not yet been fully explored, and extending their learning ability to the 4D context remains unexplored. Not surprisingly, current techniques have limited ability to model, analyze, and synthesize 4D (3D+time) human models as fully dynamic models: geometry and appearance are modeled as static poses, frame by frame.

The Human4D project takes up new challenges in the flagship context of human body modeling, aiming at a new, efficient 4D shape model of the human body in motion. Our ambition is to go beyond existing shape space representations, which mostly focus on static shape poses and seldom consider the continuous dynamics of body shapes.

To achieve this goal, Human4D is organized around four main objectives:

Overview.jpg

Task 1: 4D human data acquisition: Human4D proposes to build new datasets of moving human shapes and to study new strategies for acquiring and synthesizing new data by exploiting existing datasets. To account for the variability in human shapes and human motions, we aim to acquire hundreds of human body shapes in motion, with diverse physical characteristics. Our dynamic shape capture sessions will be based on the Kinovis platform at INRIA Grenoble, a unique acquisition platform covering a 10 m × 8 m area surrounded by 68 color cameras and 20 mocap cameras. This platform allows for large motions such as running, which gives the project access to unprecedented dynamic observations of human bodies.

Task 2: A new dynamic (4D) human shape representation: We aim to develop new representations for shapes and their evolution over time in order to enable: time super-resolution to increase modeling precision; learning over evolving 3D shape structures; modeling the non-linearity of shapes across subject and motion variability; and compact 4D spectral representations. We will investigate at least three approaches to push the limits of current representations, which model shapes but seldom their dynamics. The first consists of 3D graphs that enable local dynamic properties to be captured and learned, following our recent success. The second is a nonlinear manifold representation. Representing dynamic data on a nonlinear manifold gives a compact encoding of the constraints, together with distance measures that are in general superior to those from Euclidean space. Furthermore, statistical models on manifolds are in general better than those in Euclidean space. The third is spectral shape analysis, which relies on the study of the eigenvectors of specifically defined mesh operators and needs adaptation because the representation is domain-dependent. Building on local analysis, which has been successfully used to generalize convolutional neural networks, we set our goal to extend Geometric Deep Learning to time-varying domains.
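As a rough illustration of the spectral idea, the sketch below builds a mesh operator, takes its lowest-frequency eigenvectors, and uses them to compress a vertex sequence. It is not the project's method: it assumes a fixed triangle connectivity across all frames and uses the simplest combinatorial Laplacian, and the function names (`graph_laplacian`, `spectral_basis`, `encode`, `decode`) are hypothetical. The fact that the basis is computed from one mesh's connectivity is exactly the domain dependence mentioned above.

```python
import numpy as np


def graph_laplacian(n_vertices, faces):
    """Combinatorial Laplacian L = D - A built from triangle connectivity."""
    adjacency = np.zeros((n_vertices, n_vertices))
    for i, j, k in faces:
        for a, b in ((i, j), (j, k), (k, i)):
            adjacency[a, b] = adjacency[b, a] = 1.0
    degree = np.diag(adjacency.sum(axis=1))
    return degree - adjacency


def spectral_basis(laplacian, k):
    """First k eigenvectors (lowest frequencies) of a symmetric Laplacian."""
    # np.linalg.eigh returns eigenvalues in ascending order.
    _, eigenvectors = np.linalg.eigh(laplacian)
    return eigenvectors[:, :k]


def encode(frames, basis):
    """Project a (T, N, 3) vertex sequence onto an (N, k) basis -> (T, k, 3)."""
    return np.einsum("nk,tnc->tkc", basis, frames)


def decode(coefficients, basis):
    """Map (T, k, 3) spectral coefficients back to (T, N, 3) vertex positions."""
    return np.einsum("nk,tkc->tnc", basis, coefficients)
```

With k equal to the number of vertices the round trip is lossless; a smaller k keeps only the smooth, low-frequency part of each frame, which is what makes the representation compact.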

Task 3: 4D atlas construction: representations of multiple datasets of 4D data: The objective here is to construct a 4D human atlas that models the commonality and variability across motions and individuals, based on compact representations of multiple 4D datasets that are in spatiotemporal correspondence. Individual 4D shapes in the aforementioned representation will be integrated to form a collective representation. A significant number of works consider shape spaces that characterize the configurations of a given set of points, for instance the vertices of a mesh in facial modeling. Such 'shape spaces' also relate to many current body modeling techniques that model body poses along with mesh representations. They can either be learned or defined a priori, and are used to constrain mesh deformations when creating realistic animations or when estimating identity shape and pose from images.
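A learned shape space of the kind described above can be sketched with PCA over the stacked vertex coordinates of meshes in correspondence. This is one common choice, assumed here purely for illustration; the source does not say which construction the project uses, and the names `learn_shape_space`, `project`, and `reconstruct` are hypothetical.

```python
import numpy as np


def learn_shape_space(shapes, n_components):
    """Fit a linear shape space to (M, N, 3) meshes in vertex correspondence.

    Returns the mean shape (3N,), the principal directions (n_components, 3N),
    and the corresponding singular values.
    """
    n_shapes = shapes.shape[0]
    flat = shapes.reshape(n_shapes, -1)
    mean = flat.mean(axis=0)
    # Rows of vt are orthonormal directions of decreasing variance.
    _, singular_values, vt = np.linalg.svd(flat - mean, full_matrices=False)
    return mean, vt[:n_components], singular_values[:n_components]


def project(shape, mean, components):
    """Coordinates of one (N, 3) shape in the learned space."""
    return components @ (shape.reshape(-1) - mean)


def reconstruct(coefficients, mean, components):
    """Back from shape-space coordinates to an (N, 3) mesh."""
    return (mean + coefficients @ components).reshape(-1, 3)
```

Constraining a deformation then amounts to projecting a candidate mesh into the space and reconstructing it, which discards any displacement the training shapes cannot explain.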

Considering that shapes have a manifold structure, previous works propose manifold frameworks for human shape representation based on Lie groups of deformations. Though they can decouple the identity shape from the pose, the pose is always modeled from sparse sample data and the temporal evolution of shape is barely modeled, which limits their ability to capture the essential variability of, or distances among, multiple shape data. Another main challenge in modeling shape sequence data as stochastic processes on shape spaces is the nonlinearity of the shape spaces and of the temporal evolution of shapes. With these points in mind, we will develop an extension of nonlinear manifold learning to a collection of 4D data. We will also investigate the possibility of learning the atlas implicitly through deep neural networks.
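The difference between working on a nonlinear manifold and in Euclidean space can be illustrated on the simplest Lie group relevant to body pose, the rotation group SO(3). The sketch below is an illustration only, not the deformation model used in the works referred to above: it interpolates between two rotations along the geodesic using the group's log and exp maps, whereas averaging the two matrices entry by entry leaves the group. The helper names are hypothetical, and `so3_log` as written is not numerically robust for rotation angles close to π.

```python
import numpy as np


def hat(w):
    """Skew-symmetric matrix of a 3-vector (element of the Lie algebra so(3))."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])


def so3_exp(w):
    """Rodrigues formula: axis-angle vector -> rotation matrix."""
    theta = np.linalg.norm(w)
    K = hat(w)
    if theta < 1e-8:
        return np.eye(3) + K  # first-order approximation near the identity
    return (np.eye(3) + np.sin(theta) / theta * K
            + (1.0 - np.cos(theta)) / theta**2 * (K @ K))


def so3_log(R):
    """Rotation matrix -> axis-angle vector (assumes angle well below pi)."""
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_theta)
    v = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    if theta < 1e-8:
        return 0.5 * v
    return theta / (2.0 * np.sin(theta)) * v


def geodesic_interpolate(R0, R1, t):
    """Point at fraction t along the geodesic from R0 to R1 on SO(3)."""
    return R0 @ so3_exp(t * so3_log(R0.T @ R1))


def geodesic_distance(R0, R1):
    """Rotation angle separating R0 and R1, in radians."""
    return np.linalg.norm(so3_log(R0.T @ R1))
```

For the identity and a quarter turn about the z axis, the geodesic midpoint is a valid 45-degree rotation, while the entrywise average of the two matrices has determinant 0.5 and is therefore not a rotation at all.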

Task 4: Representative applications: Based on the constructed 4D human atlas (WP3), we will develop illustrative applications that exploit the atlas's capacity for learning and inference. They will demonstrate prediction methods that, based on the 4D human atlas, can analyze, recover, and synthesize person- and motion-specific shape changes from incomplete or partial data.