Sergey Nikolenko
Teaching activities
Machine Learning at the Kazan Federal University, 2014
This is a semester-long machine learning course taught at the Kazan Federal University with financial support from the Dynasty Foundation; see also the course page at the CSClub website.
The course itself (all slides and lecture notes are in Russian):
- 1. Introduction. History of AI. Probability theory basics. Bayes' theorem and maximum a posteriori hypotheses.
- Slides
- 2. Probability distributions. Bernoulli trials. Maximum likelihood, ML estimates for Bernoulli trials and multinomial distribution. Prior distributions, conjugate priors. Beta distribution as a conjugate prior for Bernoulli trials. Predictive distribution: Laplace's rule. Dirichlet distribution as a conjugate prior for multinomial distributions.
- 3. Gaussian distribution. Maximum likelihood estimates for the Gaussian; why the ML estimate for variance is biased. Multidimensional Gaussian. Conditional and marginal Gaussians.
- Slides for lectures 2-3
- 4. Least squares regression. Least squares as an ML estimate for Gaussian noise.
- Slides
- 5. Overfitting. Regularization. Ridge regression and lasso regression. Predictive distribution for linear regression. Classification: 1-of-K representation, linear decision functions. Fisher's linear discriminant.
- Slides
- 6. Bayes' theorem for classification. LDA and QDA. Logistic regression.
- Slides
- 7. Statistical decision theory. Regression function, optimal Bayesian classifier. Nearest neighbors. Curse of dimensionality. Bias-variance-noise decomposition.
- Slides
- 8. Reinforcement learning: multi-armed bandits. Greedy policies, exploration vs. exploitation. Confidence intervals. Minimizing regret: UCB1.
- Slides
- 9. Reinforcement learning: Markov decision processes. On-policy and off-policy learning. TD-learning. Machine learning in games (backgammon, chess, Go).
- Slides
- 10. Clustering. Hierarchical clustering, graph-based clustering. The EM algorithm. EM in general, minorization-maximization, why EM improves the likelihood. EM for clustering.
- Slides
- 11. Hidden Markov models. Baum-Welch algorithm. Applications of hidden Markov models to speech recognition.
- Slides
- 12. Probabilistic graphical models: basic idea, factorizations, d-separation. Directed and undirected models. Factor graphs.
- Slides
- 13. Inference on factor graphs. Belief propagation with the message passing algorithm.
- Slides
- 14. Case study: Bayesian rating systems. Bradley–Terry models. Expectation Propagation, TrueSkill, and its extensions.
- Slides
- 15. Approximate inference in PGMs. Loopy belief propagation. Variational approximations (idea).
- 16. Sampling and approximate inference with sampling. Markov chain Monte Carlo methods.
- Slides
- 17. Case study: text mining. Naive Bayes. Latent Dirichlet allocation and its extensions.
- Slides
- 18. Support vector machines. Kernel trick for SVMs.
- Slides
- 19. Case study: recommender systems. Nearest neighbors: user-based and item-based. Locality sensitive hashing.
- 20. Case study: recommender systems. SVD extensions. Additional information in recommender systems. Course review.
- Slides for lectures 19-20
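For a flavor of the material, the Beta-Bernoulli update from lecture 2 can be sketched in a few lines of Python. This is only an illustrative sketch and is not taken from the course slides; the function names (`beta_posterior`, `predictive_prob_success`, `ml_estimate`) are hypothetical. It assumes a Beta(alpha, beta) prior with integer parameters, so that the posterior after observing Bernoulli trials is again a Beta distribution, and the predictive probability of success is its mean. With the uniform prior Beta(1, 1) this gives Laplace's rule, (s + 1) / (n + 2).

```python
from fractions import Fraction


def beta_posterior(alpha, beta, successes, failures):
    """Conjugate update: Beta(alpha, beta) prior plus Bernoulli counts."""
    return alpha + successes, beta + failures


def predictive_prob_success(alpha, beta):
    """Predictive probability of success, i.e. the mean of Beta(alpha, beta)."""
    return Fraction(alpha, alpha + beta)


def ml_estimate(successes, failures):
    """Maximum likelihood estimate of the success probability."""
    return Fraction(successes, successes + failures)


# Uniform prior Beta(1, 1), then 3 successes and 0 failures are observed.
a, b = beta_posterior(1, 1, successes=3, failures=0)
print(ml_estimate(3, 0))              # 1
print(predictive_prob_success(a, b))  # 4/5
```

The maximum likelihood estimate claims that a failure is impossible after three successes, while the predictive distribution under the uniform prior still leaves a probability of 1/5 for it.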
Selected references.
- Christopher M. Bishop. Pattern Recognition and Machine Learning. Springer, Information Science and Statistics series, 2006.
- Kevin P. Murphy. Machine Learning: A Probabilistic Perspective. MIT Press, 2012.
- David J. C. MacKay. Information Theory, Inference, and Learning Algorithms. Cambridge University Press, 2003.