1. Abe, N., & Warmuth, M. K. (1992). On the computational complexity of approximating distributions by probabilistic automata. Machine Learning, 9, 205–260.
2. Asuncion, A. U., Welling, M., Smyth, P., & Teh, Y. W. (2009). On smoothing and inference for topic models. In Proceedings of the twenty-fifth conference on uncertainty in artificial intelligence (pp. 27–34).
3. Balle, B., Castro, J., & Gavaldà, R. (2013). Learning probabilistic automata: a study in state distinguishability. Theoretical Computer Science, 473, 46–60.
4. Beal, M. J. (2003). Variational algorithms for approximate Bayesian inference. PhD thesis, Gatsby Computational Neuroscience Unit, University College London.
5. Bengio, Y., & Grandvalet, Y. (2004). No unbiased estimator of the variance of K-fold cross-validation. Journal of Machine Learning Research, 5, 1089–1105.