Machine Learning at the Academic University
This is a one-semester course in machine learning taught in the spring of
2011 at the St. Petersburg Academic University. It aims to provide a
comprehensive review of machine learning (mainly from the Bayesian perspective) within a relatively short time.
The course itself (all slides and lecture notes are in Russian):
- 1. Introduction. History of AI. Probability theory basics. Bayes' theorem and maximum a posteriori hypotheses. A review of useful distributions.
- Slides (.pdf, 957kb)
- 2. Artificial Neural Networks. Perceptrons; learning a linear perceptron (see the code sketch after the lecture list). Non-linear perceptrons, sigmoid functions, gradient descent. ANNs and the backpropagation algorithm. Modifications: momentum, regularizers.
- Slides (.pdf, 728kb)
- 3. Bayesian classifiers. The classification problem. Optimal classifier, Gibbs classifier. The naive Bayes approach. Two naive models: multivariate and multinomial naive Bayes (see the code sketch below).
- Slides (.pdf, 342kb)
- 4. Support vector machines. Separation with linear hyperplanes, margin maximization. The kernel trick and nonlinear SVMs. SVM modifications for outlier detection.
- Slides (.pdf, 837kb)
- 5. Clustering. Hierarchical clustering. The EM algorithm and its formal justification. EM for clustering. The k-means algorithm (see the code sketch below).
- Slides (.pdf, 598kb)
- 6. Hidden Markov models. The three basic problems (see the code sketch below). The Baum-Welch algorithm and its justification. Continuous observables, time spent in states.
- Slides (.pdf, 312kb)
- 7. Priors. Conjugate priors. Conjugate priors for Bernoulli trials (see the code sketch below). Conjugate priors for the normal distribution: learning the mean for fixed variance, learning the variance for fixed mean.
- Slides (.pdf, 464kb)
- 8. Conjugate priors for the normal distribution: learning the mean and variance simultaneously.
- Slides (.pdf, 300kb)
- 9. Bayesian decoding. The MAP codeword decoding problem. Linear codes and a dynamic-programming algorithm for Bayesian decoding.
- Slides (.pdf, 269kb)
- 10. Marginalization on factor graphs (graphs of functions and variables). Min-product and max-sum. The general message passing algorithm.
- Slides (.pdf, 1393kb)
- 11. Markov random fields. Moralization and triangulation. Perfect elimination orderings. Join trees and junction trees. Inference in general Bayesian belief networks (those with undirected cycles).
- Slides (.pdf, 1248kb)
- 12. Approximate Bayesian inference. Variational approximations. QMR-DT and its variational inference algorithm.
- Slides (.pdf, 1287kb)
- 13. Boltzmann machines. Mean field theory. Approximate inference and learning in Boltzmann machines.
- Slides (.pdf, 574kb)
- 14. Sigmoid belief networks. Approximate inference in sigmoid belief networks. The LDA model, LDA inference and learning.
- Slides (.pdf, 478kb)
- 15. Hebbian learning, bidirectional associative memory, and Hopfield networks (see the code sketch below).
- Slides (.pdf, 441kb)
- 16. Bayesian rating models. The Elo rating system (see the code sketch below). Bradley-Terry models and minorization-maximization learning algorithms. The TrueSkill model and its modifications.
- Slides (.pdf, 2503kb)
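Several of the algorithms listed above are compact enough to illustrate directly in code. The sketches below are not part of the course materials: they are minimal Python illustrations under simplifying assumptions, and all function names, parameters, and datasets in them are invented for this page. The first sketch shows the classic perceptron learning rule from lecture 2 on a small linearly separable toy dataset.

```python
# A minimal perceptron-learning sketch (toy data, illustrative only).
import numpy as np

def train_perceptron(X, y, lr=0.1, epochs=100):
    """Classic perceptron rule: on a mistake, w <- w + lr * y_i * x_i (labels are +1/-1)."""
    X = np.hstack([X, np.ones((X.shape[0], 1))])  # append a constant bias feature
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            if yi * np.dot(w, xi) <= 0:   # misclassified (or exactly on the boundary)
                w += lr * yi * xi
                mistakes += 1
        if mistakes == 0:                 # converged: the data are separable
            break
    return w

X = np.array([[2.0, 1.0], [1.5, 2.0], [-1.0, -1.5], [-2.0, -0.5]])
y = np.array([1, 1, -1, -1])
print("weights:", train_perceptron(X, y))
```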
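For lecture 3, a multinomial naive Bayes sketch with Laplace smoothing; the four-word vocabulary and the tiny two-class "corpus" are made up.

```python
# A minimal multinomial naive Bayes sketch (invented counts, illustrative only).
import numpy as np

def train_multinomial_nb(X, y, alpha=1.0):
    """X: document-term count matrix, y: class labels; returns classes, log-priors, log-likelihoods."""
    classes = np.unique(y)
    log_prior = np.log(np.array([(y == c).mean() for c in classes]))
    log_lik = []
    for c in classes:
        counts = X[y == c].sum(axis=0) + alpha          # Laplace-smoothed word counts
        log_lik.append(np.log(counts / counts.sum()))   # log P(word | class)
    return classes, log_prior, np.array(log_lik)

def predict(x, classes, log_prior, log_lik):
    scores = log_prior + log_lik @ x    # log P(c) + sum_w n_w * log P(w | c)
    return classes[np.argmax(scores)]

# vocabulary: ["ball", "game", "election", "vote"]
X = np.array([[3, 2, 0, 0], [2, 1, 0, 1], [0, 0, 2, 3], [0, 1, 3, 2]])
y = np.array(["sports", "sports", "politics", "politics"])
print(predict(np.array([1, 2, 0, 0]), *train_multinomial_nb(X, y)))  # expected: sports
```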
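For lecture 5, a bare-bones k-means (Lloyd's algorithm) sketch on synthetic two-cluster data; it assumes no cluster ever becomes empty, which holds for this toy example.

```python
# A minimal k-means sketch (synthetic data, illustrative only).
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]        # random initialization
    for _ in range(iters):
        # assignment step: each point goes to its nearest center
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # update step: recompute each center as the mean of its cluster
        new_centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, (50, 2)), rng.normal(3.0, 0.3, (50, 2))])
centers, labels = kmeans(X, k=2)
print(centers)
```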
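For lecture 6, the forward algorithm solves the first of the three HMM problems, evaluating P(O | model); the two-state model parameters below are invented.

```python
# A minimal HMM forward-algorithm sketch (invented model, illustrative only).
import numpy as np

def forward(pi, A, B, obs):
    """pi: initial state probs (N,), A: transitions (N,N), B: emissions (N,M), obs: observation indices."""
    alpha = pi * B[:, obs[0]]              # alpha_1(i) = pi_i * b_i(o_1)
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]      # alpha_t(j) = [sum_i alpha_{t-1}(i) a_ij] * b_j(o_t)
    return alpha.sum()                     # P(O | model)

pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1],
              [0.1, 0.3, 0.6]])
print(forward(pi, A, B, obs=[0, 1, 2]))
```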
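For lecture 7, conjugate updating for Bernoulli trials: a Beta(a, b) prior combined with s successes and f failures gives a Beta(a + s, b + f) posterior. The pseudo-counts and data below are illustrative.

```python
# A minimal Beta-Bernoulli conjugate-prior sketch (illustrative numbers).
def update_beta(a, b, successes, failures):
    """Beta(a, b) prior + Bernoulli likelihood -> Beta(a + successes, b + failures) posterior."""
    return a + successes, b + failures

a, b = 2.0, 2.0                               # prior pseudo-counts
a, b = update_beta(a, b, successes=7, failures=3)
print("posterior parameters:", (a, b))        # Beta(9, 5)
print("posterior mean:", a / (a + b))         # 9 / 14
```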
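For lecture 15, Hebbian storage and recall in a small Hopfield network; the two stored patterns and the corrupted cue are toy examples.

```python
# A minimal Hopfield network sketch: Hebbian storage, synchronous recall (toy patterns).
import numpy as np

def store(patterns):
    """Hebbian rule: W = sum_p x_p x_p^T with the diagonal zeroed out."""
    W = patterns.T @ patterns
    np.fill_diagonal(W, 0)
    return W

def recall(W, x, steps=10):
    for _ in range(steps):
        x_new = np.sign(W @ x)
        x_new[x_new == 0] = 1        # break ties deterministically
        if np.array_equal(x_new, x): # reached a fixed point
            break
        x = x_new
    return x

patterns = np.array([[1, -1, 1, -1, 1, -1],
                     [1, 1, 1, -1, -1, -1]])
W = store(patterns)
noisy = np.array([1, -1, 1, -1, 1, 1])    # first pattern with its last bit flipped
print(recall(W, noisy))                   # recovers the first stored pattern
```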
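For lecture 16, the basic Elo update; the K-factor of 32 and the ratings are common illustrative defaults, not values taken from the lecture.

```python
# A minimal Elo rating update sketch (illustrative ratings and K-factor).
def elo_update(r_a, r_b, score_a, k=32.0):
    """score_a: 1.0 for a win by player A, 0.5 for a draw, 0.0 for a loss."""
    expected_a = 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))   # logistic expected score
    new_a = r_a + k * (score_a - expected_a)
    new_b = r_b + k * ((1.0 - score_a) - (1.0 - expected_a))
    return new_a, new_b

print(elo_update(1600, 1500, score_a=1.0))   # the winner gains exactly what the loser loses
```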
Selected references
- David J. C. MacKay. Information Theory, Inference, and Learning Algorithms. Cambridge University Press, 2003.
- Christopher M. Bishop. Pattern Recognition and Machine Learning. Springer, Information Science and Statistics series, 2006.
- A. L. Tulupyev, S. I. Nikolenko, A. V. Sirotkin. Bayesian Networks: A Probabilistic Logic Approach. St. Petersburg: Nauka, 2006. (first two pages of the book: .pdf, 815kb, in Russian; ozon.ru)