Author:
Sam Deepak Babu, Sajjan Neeraj N, Maurya Himanshu, Babu R. Venkatesh
Abstract
We present an unsupervised learning method for dense crowd count estimation. Because of the large variability in the appearance of people and the extreme overlap in crowds, enumerating people is a difficult task even for humans. As a result, creating large-scale annotated crowd data is expensive, and the resulting small datasets directly limit the performance of existing CNN-based counting models. Motivated by these challenges, we develop the Grid Winner-Take-All (GWTA) autoencoder to learn several layers of useful filters from unlabeled crowd images. Our GWTA approach divides a convolution layer spatially into a grid of cells. Within each cell, only the maximally activated neuron is allowed to update the filter. Almost 99.9% of the parameters of the proposed model are trained without any labeled data, while the remaining 0.1% are tuned with supervision. The model achieves superior results compared to other unsupervised methods and stays reasonably close to the accuracy of the supervised baseline. Furthermore, we present comparisons and analyses regarding the quality of learned features across various models.
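The grid winner-take-all step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `gwta_mask` and `gwta_forward`, the `(channels, height, width)` layout, the cell size, and the choice to select one winner per cell independently in each channel are all assumptions made for illustration. The sketch shows only the forward masking. In a training framework, the same mask would restrict gradients so that only the winning activations contribute to filter updates.

```python
import numpy as np


def gwta_mask(features, cell_h, cell_w):
    """Boolean mask marking the maximally activated position in each grid cell.

    features: array of shape (C, H, W); H and W must be divisible by the
    cell size. Winners are chosen per channel (an assumption of this sketch).
    """
    c, h, w = features.shape
    if h % cell_h or w % cell_w:
        raise ValueError("feature map size must be divisible by the cell size")
    gh, gw = h // cell_h, w // cell_w

    # Split the map into cells and flatten each cell: (C, gh, gw, cell_h * cell_w).
    cells = (
        features.reshape(c, gh, cell_h, gw, cell_w)
        .transpose(0, 1, 3, 2, 4)
        .reshape(c, gh, gw, cell_h * cell_w)
    )

    # Index of the strongest activation inside every cell.
    winners = cells.argmax(axis=-1)

    # Mark only the winner in each cell.
    mask = np.zeros(cells.shape, dtype=bool)
    np.put_along_axis(mask, winners[..., None], True, axis=-1)

    # Undo the cell layout to get back to (C, H, W).
    return (
        mask.reshape(c, gh, gw, cell_h, cell_w)
        .transpose(0, 1, 3, 2, 4)
        .reshape(c, h, w)
    )


def gwta_forward(features, cell_h, cell_w):
    """Zero every activation except the winner of each cell."""
    mask = gwta_mask(features, cell_h, cell_w)
    return features * mask, mask


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((2, 4, 4))  # toy feature map: 2 channels, 4x4
    sparse, mask = gwta_forward(x, cell_h=2, cell_w=2)
    print(mask.sum())  # one winner per cell per channel
```

With a 4x4 map and 2x2 cells, each channel has four cells, so exactly four activations per channel survive the masking.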
Publisher
Association for the Advancement of Artificial Intelligence (AAAI)
Cited by
40 articles.