Affiliation:
1. University of Chinese Academy of Sciences, China
2. Institute of Automation, Chinese Academy of Sciences, China
3. Institute of Computing Technology, Chinese Academy of Sciences, China
4. Beijing University of Posts and Telecommunications, China
5. Cardiff University, United Kingdom
Abstract
In physics-based cloth animation, rich folds and detailed wrinkles come at the cost of expensive computation and extensive manual parameter tuning. Data-driven techniques aim to reduce the computation significantly by utilizing a preprocessed database. One class of methods relies on human poses to synthesize fitted garments, but these methods cannot be applied to general cloth animations. Another class adds details to coarse meshes obtained through simulation and has no such restriction. However, existing works usually use coordinate-based representations, which cannot cope with large-scale deformation and require dense vertex correspondences between coarse and fine meshes. Moreover, because such methods only add details, they require the coarse meshes to be sufficiently close to the fine meshes, which can be either impossible or require unrealistic constraints to be applied when generating the fine meshes. To address these challenges, we develop a temporally and spatially as-consistent-as-possible deformation representation (named TS-ACAP) and design a DeformTransformer network to learn the mapping from low-resolution meshes to ones with fine details. The TS-ACAP representation is designed to ensure both spatial and temporal consistency for sequential large-scale deformations in cloth animations. With this representation, our DeformTransformer network first uses two mesh-based encoders, with shared convolutional kernels, to extract the coarse and fine features, respectively. To transduce the coarse features into the fine ones, we leverage a spatial and temporal Transformer network that consists of vertex-level and frame-level attention mechanisms to ensure detail enhancement and temporal coherence of the prediction. Experimental results show that our method produces reliable and realistic animations on various datasets at high frame rates, with superior detail synthesis compared to existing methods.
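The vertex-level and frame-level attention described above can be illustrated with a minimal NumPy sketch. This is not the authors' DeformTransformer: it assumes plain scaled dot-product self-attention with identity projections (no learned weights), random stand-in features instead of TS-ACAP features, and a spatial-then-temporal ordering that the abstract does not specify. The names `self_attention` and `spatial_temporal_attention` are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    """Scaled dot-product self-attention over the second-to-last axis.

    x has shape (..., N, D). Queries, keys and values are all x itself
    (identity projections), which is an assumption made for brevity.
    """
    d = x.shape[-1]
    scores = x @ np.swapaxes(x, -1, -2) / np.sqrt(d)  # (..., N, N)
    return softmax(scores, axis=-1) @ x               # (..., N, D)

def spatial_temporal_attention(feats):
    """feats has shape (T, V, D): T frames, V vertices, D feature channels."""
    # Vertex-level attention: each vertex attends to all vertices in its frame.
    spatial = self_attention(feats)                    # (T, V, D)
    # Frame-level attention: each vertex attends to itself across all frames.
    per_vertex = np.swapaxes(spatial, 0, 1)            # (V, T, D)
    temporal = self_attention(per_vertex)              # (V, T, D)
    return np.swapaxes(temporal, 0, 1)                 # (T, V, D)

# Stand-in features for 4 frames of a 6-vertex mesh with 8 channels.
rng = np.random.default_rng(0)
feats = rng.standard_normal((4, 6, 8))
out = spatial_temporal_attention(feats)
print(out.shape)
```

The output keeps the input shape, so a block like this could in principle be stacked. A real model would add learned query, key and value projections, multiple heads, and a decoder that maps the attended features back to fine-mesh deformations.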
Funder
National Natural Science Foundation of China
Natural Science Foundation of Beijing Municipality
H2020 LEIT Information and Communication Technologies
Center for Africana Studies, Johns Hopkins University
Subject
Computer Graphics and Computer-Aided Design