Teaching activities
Machine Learning at the Academic University
This is a year-long course in machine learning taught in 2012 at the St. Petersburg Academic University. It aims to give a comprehensive overview of machine learning, mainly from the Bayesian perspective, in a relatively short time.
The lectures (all slides and lecture notes are in Russian):
- 1. Introduction. History of AI. Probability theory basics. Bayes' theorem and maximum a posteriori hypotheses. Example: Laplace's rule (a numerical sketch follows the list).
- Slides
- 2. Least squares and nearest neighbors. Statistical decision theory. Linear regression. Linear regression from the Bayesian standpoint.
- Slides
- 3. Curse of dimensionality. Example: polynomial approximation, overfitting. Regularization: ridge regression (a closed-form sketch follows the list). Bias-variance-noise decomposition. How ridge regression follows from Gaussian priors.
- Slides (.pdf, 1991kb)
- 4. Linear regression: various forms of regularization. Bayesian predictions in linear regression. Equivalent kernel. Bayesian model selection.
- Slides (.pdf, 1303kb)
- 5. Classification. Least squares for classification. Fisher's linear discriminant. The perceptron and a proof of its convergence (a training-loop sketch follows the list).
- Slides (.pdf, 1359kb)
- 6. Linear discriminant analysis. Quadratic discriminant analysis. Naive Bayes. Multinomial and multivariate naive Bayes.
- Slides (.pdf, 1404kb)
- 7. Logistic regression. Iteratively reweighted least squares (a sketch follows the list). Multiclass logistic regression. Probit regression. Laplace approximation. Bayesian information criterion. Bayesian logistic regression.
- Slides (.pdf, 584kb)
- 8. Support vector machines. Linear separation and max-margin classifiers. Quadratic optimization. Kernel trick and radial basis functions.
- Slides (.pdf, 963kb)
- 9. SVM variants: ν-SVM, one-class SVM, SVM for regression. Relevance vector machines: RVM for regression, RVM for classification.
- Slides (.pdf, 969kb)
- 10. Clustering. Hierarchical clustering. Combinatorial methods, graph algorithms for clustering. The EM algorithm and its formal justification. EM for clustering (a mixture-model sketch follows the list).
- Slides (.pdf, 744kb)
- 11. Hidden Markov models. The three basic problems. Dynamic programming: sum-product and max-sum (a Viterbi sketch follows the list). The Baum-Welch algorithm. Variations on the HMM theme.
- Slides (.pdf, 557kb)
- 12. Model combination. Bayesian averaging. Bootstrapping and bagging. Boosting: AdaBoost (sketch after the list). Weak learners: decision trees, learning decision trees. Exponential error minimization. RankBoost.
- Slides (.pdf, 742kb)
- 13. Artificial neural networks. Two-layer networks, error functions. Backpropagation (a sketch follows the list). Example: RankNet and LambdaRank.
- Slides (.pdf, 836kb)
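The short Python sketches below are editorial illustrations, not materials from the course itself; all function and variable names are invented for the examples. First, Laplace's rule from lecture 1, read here (an assumption) as the rule of succession: with a uniform prior on a Bernoulli parameter, after s successes in n trials the predictive probability of another success is (s + 1)/(n + 2).

```python
def laplace_rule(successes: int, trials: int) -> float:
    """Predictive probability of the next success under a uniform
    Beta(1, 1) prior on a Bernoulli parameter: (s + 1) / (n + 2)."""
    return (successes + 1) / (trials + 2)

# Even after 10000 successes out of 10000 trials, the posterior
# predictive stays strictly below 1:
print(laplace_rule(10000, 10000))  # 0.99990...
```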
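Ridge regression from lecture 3 has a closed-form solution, w = (X^T X + lambda*I)^{-1} X^T y, which is also the MAP estimate under a zero-mean Gaussian prior on the weights. A minimal numpy sketch:

```python
import numpy as np

def ridge_fit(X: np.ndarray, y: np.ndarray, lam: float) -> np.ndarray:
    """MAP / regularized least-squares weights:
    w = (X^T X + lam * I)^{-1} X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=100)
print(ridge_fit(X, y, lam=1.0))  # close to the true weights [1, -2, 0.5]
```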
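The perceptron from lecture 5, in its simplest form: cycle through the examples and add y*x to the weights on every mistake. On linearly separable data the loop provably terminates, which is the convergence result mentioned in the lecture.

```python
import numpy as np

def perceptron(X: np.ndarray, y: np.ndarray, max_epochs: int = 100) -> np.ndarray:
    """Perceptron learning with labels y in {-1, +1}; append a column
    of ones to X beforehand if a bias term is needed."""
    w = np.zeros(X.shape[1])
    for _ in range(max_epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            if yi * (w @ xi) <= 0:    # misclassified (or on the boundary)
                w += yi * xi          # the perceptron update
                mistakes += 1
        if mistakes == 0:             # converged: every example correct
            break
    return w
```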
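Iteratively reweighted least squares from lecture 7 is Newton's method applied to the logistic-regression log-likelihood; with labels in {0, 1}, the gradient is X^T (p - y) and the Hessian is X^T R X with R = diag(p(1 - p)). A sketch (the small jitter on the Hessian is only there to keep the solve stable):

```python
import numpy as np

def irls_logistic(X: np.ndarray, y: np.ndarray, n_iter: int = 20) -> np.ndarray:
    """Newton / IRLS updates for logistic regression, labels y in {0, 1}."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ w))           # current probabilities
        R = p * (1.0 - p)                          # diagonal Newton weights
        H = X.T @ (X * R[:, None]) + 1e-8 * np.eye(X.shape[1])
        g = X.T @ (p - y)                          # gradient of neg. log-lik.
        w -= np.linalg.solve(H, g)                 # Newton step
    return w
```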
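EM for clustering from lecture 10, sketched for a one-dimensional mixture of k Gaussians: the E step computes soft assignments (responsibilities), the M step re-estimates the parameters by weighted maximum likelihood. A bare-bones version with no protection against collapsing components:

```python
import numpy as np

def em_gmm_1d(x: np.ndarray, k: int = 2, n_iter: int = 50, seed: int = 0):
    """EM for a 1-D Gaussian mixture; returns (weights, means, stds)."""
    rng = np.random.default_rng(seed)
    mu = rng.choice(x, size=k)                     # initialize means from data
    sigma = np.full(k, x.std())
    pi = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E step: responsibilities r[i, j] ~ pi_j * N(x_i; mu_j, sigma_j)
        z = (x[:, None] - mu) / sigma
        log_r = -0.5 * z**2 - np.log(sigma) + np.log(pi)
        log_r -= log_r.max(axis=1, keepdims=True)  # for numerical stability
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)
        # M step: weighted maximum-likelihood re-estimates
        nk = r.sum(axis=0)
        mu = (r * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
        pi = nk / len(x)
    return pi, mu, sigma
```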
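The max-sum recursion from lecture 11 is the Viterbi algorithm: propagate the best log-probability of ending in each state, remember back-pointers, and trace them back to recover the most likely state path.

```python
import numpy as np

def viterbi(pi: np.ndarray, A: np.ndarray, B: np.ndarray, obs: list) -> list:
    """Most likely HMM state path. pi: initial probs (k,), A: transition
    matrix (k, k), B: emission matrix (k, m), obs: observation indices."""
    k, T = len(pi), len(obs)
    delta = np.log(pi) + np.log(B[:, obs[0]])   # best log-prob per state
    psi = np.zeros((T, k), dtype=int)           # back-pointers
    for t in range(1, T):
        scores = delta[:, None] + np.log(A)     # scores[i, j]: best via i -> j
        psi[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + np.log(B[:, obs[t]])
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):               # follow back-pointers
        path.append(int(psi[t, path[-1]]))
    return path[::-1]
```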
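AdaBoost from lecture 12 with brute-force threshold stumps as weak learners; the reweighting step, where each mistake gets its weight multiplied by e^alpha, is exactly where the exponential-error view of boosting appears.

```python
import numpy as np

def adaboost_stumps(X: np.ndarray, y: np.ndarray, n_rounds: int = 20) -> list:
    """AdaBoost with threshold stumps, labels y in {-1, +1}.
    Returns a list of (feature, threshold, sign, alpha) tuples."""
    w = np.full(len(y), 1.0 / len(y))             # example weights
    ensemble = []
    for _ in range(n_rounds):
        best = None
        for j in range(X.shape[1]):               # exhaustive stump search
            for thr in np.unique(X[:, j]):
                for s in (+1, -1):
                    pred = s * np.sign(X[:, j] - thr + 1e-12)
                    err = w[pred != y].sum()      # weighted training error
                    if best is None or err < best[0]:
                        best = (err, j, thr, s, pred)
        err, j, thr, s, pred = best
        alpha = 0.5 * np.log((1 - err) / max(err, 1e-12))
        w *= np.exp(-alpha * y * pred)            # mistakes gain weight
        w /= w.sum()
        ensemble.append((j, thr, s, alpha))
    return ensemble

def adaboost_predict(ensemble: list, X: np.ndarray) -> np.ndarray:
    """Weighted vote of the stumps: sign(sum_t alpha_t * h_t(x))."""
    total = np.zeros(len(X))
    for j, thr, s, alpha in ensemble:
        total += alpha * s * np.sign(X[:, j] - thr + 1e-12)
    return np.sign(total)
```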
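Finally, backpropagation from lecture 13, for a two-layer network with tanh hidden units and a linear output trained on squared error by plain gradient descent (biases omitted to keep the sketch short):

```python
import numpy as np

def train_two_layer(X, y, hidden=16, lr=0.1, epochs=500, seed=0):
    """Gradient descent with backpropagation for a two-layer network."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(X.shape[1], hidden))
    W2 = rng.normal(scale=0.5, size=hidden)
    for _ in range(epochs):
        h = np.tanh(X @ W1)                        # forward: hidden layer
        out = h @ W2                               # forward: linear output
        d_out = out - y                            # dE/d(out), squared error
        d_h = np.outer(d_out, W2) * (1 - h**2)     # backprop through tanh
        W2 -= lr * h.T @ d_out / len(y)            # gradient steps
        W1 -= lr * X.T @ d_h / len(y)
    return W1, W2
```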
Selected references
- Christopher M. Bishop. Pattern Recognition and Machine Learning. Springer, Information Science and Statistics series, 2006.
- Trevor Hastie, Robert Tibshirani, and Jerome Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd ed., Springer, 2009.
- David J. C. MacKay. Information Theory, Inference, and Learning Algorithms. Cambridge University Press, 2003.