1. P. Schwarz, P. Matejka, L. Burget, and O. Glembek, "Phoneme recognizer based on long temporal context," Speech Processing Group, Faculty of Information Technology, Brno University of Technology, 2006. [Online]. Available: http://speech.fit.vutbr.cz/en/software
2. H. Hermansky and S. Sharma, "Temporal patterns (TRAPs) in ASR of noisy speech," in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '99), vol. 1, pp. 289--292, March 1999.
3. A. Mohamed, G. E. Dahl, and G. Hinton, "Acoustic modeling using deep belief networks," IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, pp. 14--22, January 2012.
4. G. E. Dahl, D. Yu, L. Deng, and A. Acero, "Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition," IEEE Transactions on Audio, Speech, and Language Processing, vol. 20, no. 1, pp. 30--42, 2012.
5. B. Kingsbury, T. N. Sainath, and H. Soltau, "Scalable minimum Bayes risk training of deep neural network acoustic models using distributed Hessian-free optimization," in 13th Annual Conference of the International Speech Communication Association (InterSpeech 2012), pp. 10--13, ISCA, September 2012.