References
Bayesian statistics and information theory
For an introduction to Bayesian statistics, information theory, and Markov chain Monte Carlo (MCMC), David MacKay's "Information Theory, Inference and Learning Algorithms"1 is an excellent choice, and it is available for free.
MacKay, D. J. (2003). Information theory, inference and learning algorithms. Cambridge University Press. (PDF)
Dirichlet process mixture models
For an introduction to infinite mixture models via the Dirichlet process, Carl Rasmussen's "The infinite Gaussian mixture model"2 introduces the model, and Radford Neal's "Markov chain sampling methods for Dirichlet process mixture models"3 introduces the basic MCMC methods. When I was learning Dirichlet process mixture models, I found Frank Wood and Michael Black's "A nonparametric Bayesian alternative to spike sorting"4 extremely helpful. Because its target audience is applied scientists, it lays things out more simply and completely than a manuscript aimed at statisticians or computer scientists might.
Rasmussen, C. (1999). The infinite Gaussian mixture model. Advances in neural information processing systems, 12. (PDF)
Neal, R. M. (2000). Markov chain sampling methods for Dirichlet process mixture models. Journal of computational and graphical statistics, 9(2), 249-265. (PDF)
Wood, F., & Black, M. J. (2008). A nonparametric Bayesian alternative to spike sorting. Journal of neuroscience methods, 173(1), 1-12. (PDF)
Probabilistic cross-categorization (PCC)
For a compact explanation designed for people unfamiliar with Bayesian statistics, see Shafto et al. 5. This work is targeted at psychologists and demonstrates PCC's power to model human cognitive capabilities. For an incredibly in-depth overview with loads of math, use cases, and examples, see Mansinghka et al. 6.
Shafto, P., Kemp, C., Mansinghka, V., & Tenenbaum, J. B. (2011). A probabilistic model of cross-categorization. Cognition, 120(1), 1-25. (PDF)
Mansinghka, V., Shafto, P., Jonas, E., Petschulat, C., Gasner, M., & Tenenbaum, J. B. (2016). CrossCat: A fully Bayesian nonparametric method for analyzing heterogeneous, high dimensional data. (PDF)