
Sergey Nikolenko



Teaching activities

Machine Learning at the Academic University

This is a year-long course in machine learning taught in 2012 at the St. Petersburg Academic University. It aims to give a comprehensive overview of machine learning, mainly from the Bayesian perspective, within a relatively short time.

The course itself (all slides and lecture notes are in Russian):

1. Introduction. History of AI. Probability theory basics. Bayes' theorem and maximum a posteriori hypotheses. Example: Laplace's rule.
Slides
2. Least squares and nearest neighbors. Statistical decision theory. Linear regression. Linear regression from the Bayesian standpoint.
Slides
3. Curse of dimensionality. Example: polynomial approximation, overfitting. Regularization: ridge regression. Bias-variance-noise decomposition. How ridge regression follows from Gaussian priors.
Slides (.pdf, 1991kb)
4. Linear regression: various forms of regularization. Bayesian predictions in linear regression. Equivalent kernel. Bayesian model selection.
Slides (.pdf, 1303kb)
5. Classification. Least squares for classification. Fisher's linear discriminant. Perceptron and proof of its convergence.
Slides (.pdf, 1359kb)
6. Linear discriminant analysis. Quadratic discriminant analysis. Naive Bayes. Multinomial and multivariate naive Bayes.
Slides (.pdf, 1404kb)
7. Logistic regression. Iteratively reweighted least squares. Multiclass logistic regression. Probit regression. Laplace approximation. Bayesian information criterion. Bayesian logistic regression.
Slides (.pdf, 584kb)
8. Support vector machines. Linear separation and max-margin classifiers. Quadratic optimization. Kernel trick and radial basis functions.
Slides (.pdf, 963kb)
9. SVM variants: ν-SVM, one-class SVM, SVM for regression. Relevance vector machines: RVM for regression, RVM for classification.
Slides (.pdf, 969kb)
10. Clustering. Hierarchical clustering. Combinatorial methods, graph algorithms for clustering. The EM algorithm, its formal justification. EM for clustering.
Slides (.pdf, 744kb)
11. Hidden Markov models. The three problems. Dynamic programming: sum-product and max-sum. The Baum-Welch algorithm. Variations on the HMM theme.
Slides (.pdf, 557kb)
12. Model combination. Bayesian averaging. Bootstrapping and bagging. Boosting: AdaBoost. Weak learners: decision trees, learning decision trees. Exponential error minimization. RankBoost.
Slides (.pdf, 742kb)
13. Artificial neural networks. Two-layer networks, error functions. Backpropagation. Example: RankNet and LambdaRank.
Slides (.pdf, 836kb)
Selected references
  1. Christopher M. Bishop. Pattern Recognition and Machine Learning. Springer, Information Science and Statistics series, 2006.
  2. Trevor Hastie, Robert Tibshirani, and Jerome Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed. Springer, 2009.
  3. David J. C. MacKay. Information Theory, Inference, and Learning Algorithms. Cambridge University Press, 2003.