Sergey Nikolenko

Teaching activities

Machine Learning

This course is taught at the Academic University of Physics and Technology as part of the recently established Chair of Mathematics and Computer Science.

The course syllabus (some lectures were accompanied by slides, which can be found on the pages of my other machine learning courses):

1. Introduction. History of AI. Decision trees. ID3 (an information-gain sketch follows the list). Bayesian analysis of classification problems. Complexity measures based on decision trees and bounds on decision tree complexity.
2. Artificial neural networks. Perceptrons; learning a linear perceptron (see the gradient-descent sketch after the list). Non-linear perceptrons and gradient descent. ANNs and the backpropagation algorithm. Coping with overfitting.
3. Genetic algorithms. Crossover on binary strings and trees. Genetic programming. The Baldwin effect. Ant colony optimization. Simulated annealing.
4. Concept learning: the Find-S and Candidate Elimination algorithms. Bayesian classifiers: optimal, Gibbs, and naive. An information-theoretic view: why the expected error of the Gibbs classifier is at most twice that of the optimal classifier.
5. Marginalization. Brute-force marginalization. Marginalization by integration. MAP estimates for the normal distribution. Prior and posterior distributions. Conjugate priors. Conjugate priors for Bernoulli trials (see the Beta-Bernoulli sketch below).
6. Some coding theory: error-correcting codes, codes and their trellises. Decoding as Bayesian inference. Sum-product and min-sum algorithms.
7. Marginalization in general. The graph of functions and variables (factor graph). Sum-product and min-sum in the general case: the message-passing algorithm. Approximate marginalization: the Laplace method.
8. Sampling. The sampling problem and why it is hard. Importance sampling and rejection sampling. MCMC (Markov chain Monte Carlo) methods. The Metropolis-Hastings method, Markov chains, slice sampling. Gibbs sampling (a Metropolis-Hastings sketch follows the list).
9. The EM algorithm. EM for estimating the parameters of mixtures of distributions (see the Gaussian mixture sketch below). Maximizing the variational free energy: a justification of EM.
10. Hidden Markov models. The model and its three inference tasks. The Viterbi, sum-product, and Baum-Welch algorithms (a Viterbi sketch follows the list). Justification of the Baum-Welch algorithm. Continuous observables, distributions over the time spent in a given state, and other generalizations.
11. Ratings as a Bayesian inference task. The Bradley-Terry model and the Elo rating system (see the Elo update sketch below). Minorization-maximization algorithms. The TrueSkill rating system.
12. Clustering. Graph algorithms, hierarchical clustering, FOREL. EM for clustering, k-means, fuzzy c-means.
13. Reinforcement learning. Multi-armed bandits. Reinforcement learning in a known model. Value functions, value iteration, and policy iteration (a value iteration sketch follows the list). Markov decision processes: the Monte Carlo method, TD-learning, Sarsa. Example: the TD-Gammon approach.
14. A case study: building a recommendation system. Approaches: naive Bayes classifier, k-means clustering, the multinomial model, the multinomial mixture model, the Aspect model. EM schemes for estimating parameters in the multinomial mixture model.
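Below is a minimal sketch of the information-gain criterion that drives ID3 (lecture 1). The toy attribute names and labels at the bottom are made up purely for illustration.

```python
# Minimal sketch of the information-gain criterion used by ID3.
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(examples, labels, attribute):
    """Entropy reduction from splitting `examples` on `attribute`;
    `examples` is a list of dicts, `labels` the corresponding classes."""
    base, n, split = entropy(labels), len(labels), {}
    for x, y in zip(examples, labels):
        split.setdefault(x[attribute], []).append(y)
    return base - sum(len(part) / n * entropy(part) for part in split.values())

# ID3 greedily picks the attribute with the largest information gain,
# splits the data on it, and recurses on each subset.
examples = [{"outlook": "sunny"}, {"outlook": "sunny"}, {"outlook": "rain"}]
labels = ["no", "yes", "yes"]
print(information_gain(examples, labels, "outlook"))
```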
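For lecture 2, here is a hedged sketch of gradient descent on squared error for a single linear unit (the delta rule); the learning rate, epoch count, and training points are arbitrary choices made for the example.

```python
# Sketch of the delta rule: stochastic gradient descent on squared error
# for one linear unit. The training data below are invented.
def train_linear_unit(xs, ys, lr=0.05, epochs=200):
    """xs: list of feature tuples, ys: target values."""
    w = [0.0] * (len(xs[0]) + 1)          # last component is the bias
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            out = sum(wi * xi for wi, xi in zip(w, x)) + w[-1]
            err = y - out
            # gradient step: w_i += lr * err * x_i, bias += lr * err
            for i, xi in enumerate(x):
                w[i] += lr * err * xi
            w[-1] += lr * err
    return w

# should roughly recover y = 2*x1 - x2 + 1
xs = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
ys = [1, 3, 0, 2, 4]
print(train_linear_unit(xs, ys))
```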
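For the conjugate priors of lecture 5, this tiny sketch shows the Beta-Bernoulli update: a Beta(a, b) prior on the success probability combined with observed successes and failures yields another Beta distribution. The counts in the example are invented.

```python
# With a Beta(a, b) prior and `heads` successes / `tails` failures observed,
# the posterior is Beta(a + heads, b + tails).
def beta_bernoulli_update(a, b, heads, tails):
    """Return the parameters of the posterior Beta distribution."""
    return a + heads, b + tails

def beta_mean(a, b):
    return a / (a + b)

a, b = beta_bernoulli_update(1, 1, heads=7, tails=3)   # uniform Beta(1, 1) prior
print(a, b, beta_mean(a, b))                           # Beta(8, 4), mean 2/3
```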
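For the MCMC material of lecture 8, a minimal random-walk Metropolis-Hastings sampler; the target density (a standard normal, given up to a constant) and the step size are assumptions made just for this example.

```python
# Random-walk Metropolis-Hastings for an unnormalized 1-D target density.
import math
import random

def metropolis_hastings(log_p, x0=0.0, step=1.0, n=10000):
    """Sample from a density proportional to exp(log_p(x))."""
    x, samples = x0, []
    for _ in range(n):
        proposal = x + random.gauss(0.0, step)     # symmetric proposal
        # accept with probability min(1, p(proposal) / p(x))
        if random.random() < math.exp(min(0.0, log_p(proposal) - log_p(x))):
            x = proposal
        samples.append(x)
    return samples

# target: standard normal, log density up to an additive constant
samples = metropolis_hastings(lambda x: -0.5 * x * x)
print(sum(samples) / len(samples))   # should be close to 0
```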
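For lecture 9, a sketch of EM for a two-component one-dimensional Gaussian mixture, assuming known and equal variances so that only the means and the mixing weight are re-estimated; the synthetic data are generated on the spot.

```python
# EM for a two-component 1-D Gaussian mixture with fixed, equal variances.
import math
import random

def em_two_gaussians(data, iters=50, sigma=1.0):
    mu = [min(data), max(data)]        # crude initialization of the means
    pi = 0.5                           # mixing weight of the first component
    for _ in range(iters):
        # E-step: responsibility of component 0 for each point
        resp = []
        for x in data:
            p0 = pi * math.exp(-0.5 * ((x - mu[0]) / sigma) ** 2)
            p1 = (1 - pi) * math.exp(-0.5 * ((x - mu[1]) / sigma) ** 2)
            resp.append(p0 / (p0 + p1))
        # M-step: re-estimate means and mixing weight from responsibilities
        n0 = sum(resp)
        mu[0] = sum(r * x for r, x in zip(resp, data)) / n0
        mu[1] = sum((1 - r) * x for r, x in zip(resp, data)) / (len(data) - n0)
        pi = n0 / len(data)
    return mu, pi

data = [random.gauss(-2, 1) for _ in range(200)] + [random.gauss(3, 1) for _ in range(200)]
print(em_two_gaussians(data))          # means near -2 and 3, weight near 0.5
```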
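For the HMM inference of lecture 10, a sketch of the Viterbi algorithm on a toy weather model; the states and the transition and emission probabilities are invented for illustration.

```python
# Viterbi algorithm for a discrete HMM on a made-up two-state weather model.
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most probable state sequence for the observations."""
    V = [{s: (start_p[s] * emit_p[s][obs[0]], None) for s in states}]
    for t in range(1, len(obs)):
        V.append({})
        for s in states:
            prob, prev = max(
                (V[t - 1][r][0] * trans_p[r][s] * emit_p[s][obs[t]], r)
                for r in states
            )
            V[t][s] = (prob, prev)
    # backtrack from the best final state
    path = [max(states, key=lambda s: V[-1][s][0])]
    for t in range(len(obs) - 1, 0, -1):
        path.append(V[t][path[-1]][1])
    return list(reversed(path))

states = ("Rainy", "Sunny")
start_p = {"Rainy": 0.6, "Sunny": 0.4}
trans_p = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3}, "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
emit_p = {"Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
          "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}
print(viterbi(["walk", "shop", "clean"], states, start_p, trans_p, emit_p))
```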
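For lecture 11, the Elo update in a few lines; the K-factor of 32 and the sample ratings are arbitrary, and the expected score uses the logistic curve implied by the Bradley-Terry model.

```python
# One Elo rating update for a single game between players A and B.
def elo_update(r_a, r_b, score_a, k=32):
    """score_a is 1 for a win by A, 0.5 for a draw, 0 for a loss."""
    expected_a = 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))
    delta = k * (score_a - expected_a)
    return r_a + delta, r_b - delta

print(elo_update(1613, 1609, 1))   # A is slightly favoured, wins, gains about 16 points
```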
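For lecture 13, a sketch of value iteration on a tiny made-up MDP with two states and two actions; the transition probabilities, rewards, and discount factor are all arbitrary choices for the example.

```python
# Value iteration on a toy MDP given as dictionaries of transitions and rewards.
def value_iteration(states, actions, transitions, reward, gamma=0.9, eps=1e-6):
    """transitions[s][a] is a list of (probability, next_state) pairs."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            q = [reward(s, a) + gamma * sum(p * V[s2] for p, s2 in transitions[s][a])
                 for a in actions]
            best = max(q)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < eps:
            return V

# two-state example: "stay" is safe, "jump" changes state
states, actions = ["low", "high"], ["stay", "jump"]
transitions = {
    "low":  {"stay": [(1.0, "low")],  "jump": [(0.5, "high"), (0.5, "low")]},
    "high": {"stay": [(1.0, "high")], "jump": [(1.0, "low")]},
}
rewards = {("low", "stay"): 0, ("low", "jump"): -1,
           ("high", "stay"): 2, ("high", "jump"): 0}
print(value_iteration(states, actions, transitions, lambda s, a: rewards[(s, a)]))
```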