Machine Learning
This course is taught at the «Academic University of Physics and Technology»
as part of the recently established Chair of Mathematics and Computer Science.
The course outline (some lectures were accompanied by slides, which can be found on the pages of my other machine learning courses):
- 1. Introduction. History of AI. Decision trees. ID3. Bayesian analysis of classification problems. Complexity measures based on decision trees and bounds on decision tree complexity.
- 2. Artificial neural networks. Perceptrons; learning a linear perceptron. Non-linear perceptrons and gradient descent. ANNs and the backpropagation algorithm. Coping with overfitting. (A gradient descent sketch for a single sigmoid unit is given after this list.)
- 3. Genetic algorithms. Crossover on binary strings and trees. Genetic programming. The Baldwin effect. Ant colony optimization. Simulated annealing.
- 4. Concept learning: the Find-S and Candidate Elimination algorithms. Bayesian classifiers: optimal, Gibbs, and naive. An information-theoretic view: why the expected error of the Gibbs classifier is at most twice that of the optimal classifier. (A naive Bayes sketch is given below the list.)
- 5. Marginalization. Brute-force marginalization. Marginalization by integration. MAP estimates for the normal distribution. Prior and posterior distributions. Conjugate priors; the conjugate prior for Bernoulli trials (sketched in code below the list).
- 6. Some coding theory: error-correcting codes, codes and their trellises. Decoding as Bayesian inference. Sum-product and min-sum algorithms.
- 7. Marginalization in general. Factor graphs of functions and variables. Sum-product and min-sum in the general case: the message passing algorithm. Approximate marginalization: the Laplace method.
- 8. Sampling. The sampling problem and why it is hard. Importance sampling and rejection sampling. MCMC (Markov chain Monte Carlo) methods. The Metropolis-Hastings method, Markov chains, slice sampling. Gibbs sampling. (A Metropolis-Hastings sketch appears after this list.)
- 9. The EM algorithm. EM for estimating the parameters of mixture distributions. Maximizing the variational free energy: a justification of EM. (An EM sketch for a Gaussian mixture appears after this list.)
- 10. Hidden Markov models. The model, three inference tasks. The Viterbi, sum-product, and Baum-Welch algorithms. Justification of the Baum-Welch algorithm. Continuous observables, distributions over time spent in a given state and other generalizations.
- 11. Ratings as a Bayesian inference task. The Bradley-Terry model and the Elo rating system. Minorization-maximization algorithms. The TrueSkill rating system. (An Elo update sketch is given below the list.)
- 12. Clustering. Graph algorithms, hierarchical clustering, FOREL. EM for clustering, k-means, fuzzy c-means.
- 13. Reinforcement learning. Multi-armed bandits. Reinforcement learning with a known model. Value functions, value iteration, and policy iteration. Markov decision processes: Monte Carlo methods, TD learning, SARSA. Example: the TD-Gammon approach. (A value iteration sketch is given after this list.)
- 14. A case study: building a recommendation system. Approaches: naive Bayes classifier, k-means clustering, the multinomial model, the multinomial mixture model, the Aspect model. EM schemes for estimating parameters in the multinomial mixture model.
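A few illustrative code sketches for the topics above follow; none of them come from the course materials, and all data, names, and hyperparameters in them are assumptions chosen for brevity.

For lecture 2, a minimal sketch of learning a single non-linear (sigmoid) perceptron by gradient descent on a toy logical-AND dataset; the learning rate and number of epochs are arbitrary illustrative choices:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy data: inputs with a bias column, binary targets (logical AND).
X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
y = np.array([0, 0, 0, 1], dtype=float)

rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=3)   # weights; the last component acts as the bias
lr = 0.5                            # learning rate (an assumed value)

for epoch in range(1000):
    p = sigmoid(X @ w)              # predictions in (0, 1)
    grad = X.T @ (p - y) / len(y)   # gradient of the mean cross-entropy loss
    w -= lr * grad                  # gradient descent step

print(np.round(sigmoid(X @ w), 2))  # close to [0, 0, 0, 1]
```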
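For lecture 4, a minimal sketch of a naive Bayes classifier with binary features and Laplace (add-one) smoothing; the tiny dataset is invented for illustration:

```python
import numpy as np

X = np.array([[1, 1, 0],
              [1, 0, 0],
              [0, 1, 1],
              [0, 0, 1]])           # binary feature vectors
y = np.array([0, 0, 1, 1])          # class labels

classes = np.unique(y)
log_prior = np.log(np.array([(y == c).mean() for c in classes]))
# P(feature_j = 1 | class c), with add-one smoothing
theta = np.array([(X[y == c].sum(axis=0) + 1.0) / ((y == c).sum() + 2.0)
                  for c in classes])

def predict(x):
    # log P(c) + sum_j log P(x_j | c), taking the argmax over classes
    log_lik = (x * np.log(theta) + (1 - x) * np.log(1 - theta)).sum(axis=1)
    return classes[np.argmax(log_prior + log_lik)]

print(predict(np.array([1, 1, 0])))  # -> 0
print(predict(np.array([0, 0, 1])))  # -> 1
```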
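For lecture 5, a minimal sketch of the conjugate Beta prior for Bernoulli trials: with a Beta(a, b) prior, observing h successes and t failures gives a Beta(a + h, b + t) posterior. The prior pseudo-counts and the data are assumed values:

```python
a, b = 2.0, 2.0          # prior pseudo-counts (an assumed choice)
h, t = 7, 3              # observed successes and failures

a_post, b_post = a + h, b + t
posterior_mean = a_post / (a_post + b_post)        # 9 / 14
map_estimate = (a_post - 1) / (a_post + b_post - 2)  # 8 / 12

print(round(posterior_mean, 3))  # 0.643
print(round(map_estimate, 3))    # 0.667
```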
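For lecture 8, a minimal sketch of random-walk Metropolis-Hastings sampling from an unnormalized one-dimensional density; the target density, proposal scale, and burn-in length are illustrative assumptions:

```python
import numpy as np

def unnormalized_target(x):
    # a mixture of two Gaussian bumps, known only up to a normalizing constant
    return np.exp(-0.5 * (x - 2) ** 2) + 0.5 * np.exp(-0.5 * (x + 2) ** 2)

rng = np.random.default_rng(0)
x = 0.0
samples = []
for step in range(20000):
    proposal = x + rng.normal(scale=1.0)          # symmetric random-walk proposal
    accept_prob = min(1.0, unnormalized_target(proposal) / unnormalized_target(x))
    if rng.random() < accept_prob:                # accept, or keep the old state
        x = proposal
    samples.append(x)

samples = np.array(samples[2000:])                # drop burn-in
print(samples.mean(), samples.std())
```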
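For lecture 9, a minimal sketch of EM for a two-component one-dimensional Gaussian mixture; the synthetic data and the initialization are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 0.5, 200)])

# initial guesses for the mixture weights, means, and variances
pi = np.array([0.5, 0.5])
mu = np.array([-1.0, 1.0])
var = np.array([1.0, 1.0])

def gauss(x, m, v):
    return np.exp(-0.5 * (x - m) ** 2 / v) / np.sqrt(2 * np.pi * v)

for iteration in range(100):
    # E-step: responsibilities gamma[i, k] = P(component k | data[i])
    dens = pi * gauss(data[:, None], mu, var)
    gamma = dens / dens.sum(axis=1, keepdims=True)
    # M-step: re-estimate the parameters from the weighted data
    n_k = gamma.sum(axis=0)
    pi = n_k / len(data)
    mu = (gamma * data[:, None]).sum(axis=0) / n_k
    var = (gamma * (data[:, None] - mu) ** 2).sum(axis=0) / n_k

print(np.round(mu, 2), np.round(var, 2), np.round(pi, 2))
```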
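For lecture 11, a minimal sketch of an Elo-style rating update, which follows from a Bradley-Terry / logistic model of game outcomes; the K-factor and the ratings below are assumed values:

```python
def expected_score(r_a, r_b):
    # probability that player A beats player B under the logistic model
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def elo_update(r_a, r_b, score_a, k=32.0):
    # score_a is 1 for a win, 0.5 for a draw, 0 for a loss
    e_a = expected_score(r_a, r_b)
    return (r_a + k * (score_a - e_a),
            r_b + k * ((1.0 - score_a) - (1.0 - e_a)))

print(elo_update(1600, 1500, 1.0))  # the winner gains what the loser drops
```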
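For lecture 13, a minimal sketch of value iteration on a tiny Markov decision process; the transition and reward tables are invented for illustration:

```python
import numpy as np

n_states, n_actions, gamma = 3, 2, 0.9
# P[s, a, s'] = transition probability, R[s, a] = expected immediate reward
P = np.array([[[0.8, 0.2, 0.0], [0.1, 0.9, 0.0]],
              [[0.0, 0.5, 0.5], [0.0, 0.1, 0.9]],
              [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]])
R = np.array([[0.0, 0.0],
              [0.0, 1.0],
              [0.0, 5.0]])

V = np.zeros(n_states)
for sweep in range(1000):
    # Bellman optimality backup: V(s) <- max_a [ R(s,a) + gamma * sum_s' P(s'|s,a) V(s') ]
    Q = R + gamma * (P @ V)
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:
        break
    V = V_new

policy = Q.argmax(axis=1)   # greedy policy with respect to the converged values
print(np.round(V, 2), policy)
```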