Machine Learning
This course is taught at the «Academic University of Physics and Technology»
as part of the recently established Chair of Mathematics and Computer Science.
The course outline (some lectures were accompanied by slides, which can be found on the pages of my other machine learning courses):
- 1. Introduction. History of AI. Decision trees. ID3. Bayesian analysis of classification problems. Complexity measures based on decision trees and bounds on decision tree complexity.
- 2. Artificial neural networks. Perceptrons; learning a linear perceptron. Non-linear perceptrons and gradient descent. ANNs and the backpropagation algorithm. Coping with overfitting. (A gradient descent sketch for a single sigmoid unit is given after this list.)
- 3. Genetic algorithms. Crossover on binary strings and trees. Genetic programming. The Baldwin effect. Ant colony optimization. Simulated annealing.
- 4. Concept learning: the Find-S and Candidate Elimination algorithms. Bayesian classifiers: optimal, Gibbs, and naive. An information-theoretic view: why the expected error of the Gibbs classifier is at most twice that of the optimal classifier. (A naive Bayes sketch is given below the list.)
- 5. Marginalization. Brute-force marginalization. Marginalization by integration. MAP estimates for the normal distribution. Prior and posterior distributions. Conjugate priors; the conjugate prior for Bernoulli trials (sketched in code below the list).
- 6. Some coding theory: error-correcting codes, codes and their trellises. Decoding as Bayesian inference. Sum-product and min-sum algorithms.
- 7. Marginalization in general. Factor graphs of functions and variables. Sum-product and min-sum in the general case: the message passing algorithm. Approximate marginalization: the Laplace method.
- 8. Sampling. The sampling problem and why it is hard. Importance sampling and rejection sampling. MCMC (Markov chain Monte Carlo) methods. The Metropolis-Hastings method, Markov chains, slice sampling. Gibbs sampling. (A Metropolis-Hastings sketch appears after this list.)
- 9. The EM algorithm. EM for estimating the parameters of mixture distributions. Maximizing the variational free energy: a justification of EM. (An EM sketch for a Gaussian mixture appears after this list.)
- 10. Hidden Markov models. The model, three inference tasks. The Viterbi, sum-product, and Baum-Welch algorithms. Justification of the Baum-Welch algorithm. Continuous observables, distributions over time spent in a given state and other generalizations.
- 11. Ratings as a Bayesian inference task. The Bradley-Terry model and the Elo rating system. Minorization-maximization algorithms. The TrueSkill rating system. (An Elo update sketch is given below the list.)
- 12. Clustering. Graph algorithms, hierarchical clustering, FOREL. EM for clustering, k-means, fuzzy c-means.
- 13. Reinforcement learning. Multi-armed bandits. Reinforcement learning with a known model. Value functions, value iteration, and policy iteration. Markov decision processes: Monte Carlo methods, TD learning, SARSA. Example: the TD-Gammon approach. (A value iteration sketch is given after this list.)
- 14. A case study: building a recommendation system. Approaches: naive Bayes classifier, k-means clustering, the multinomial model, the multinomial mixture model, the Aspect model. EM schemes for estimating parameters in the multinomial mixture model.
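A few illustrative code sketches for the topics above follow; none of them come from the course materials, and all data, names, and hyperparameters in them are assumptions chosen for brevity.

For lecture 2, a minimal sketch of learning a single non-linear (sigmoid) perceptron by gradient descent on a toy logical-AND dataset; the learning rate and number of epochs are arbitrary illustrative choices:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy data: inputs with a bias column, binary targets (logical AND).
X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
y = np.array([0, 0, 0, 1], dtype=float)

rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=3)   # weights; the last component acts as the bias
lr = 0.5                            # learning rate (an assumed value)

for epoch in range(1000):
    p = sigmoid(X @ w)              # predictions in (0, 1)
    grad = X.T @ (p - y) / len(y)   # gradient of the mean cross-entropy loss
    w -= lr * grad                  # gradient descent step

print(np.round(sigmoid(X @ w), 2))  # close to [0, 0, 0, 1]
```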
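For lecture 4, a minimal sketch of a naive Bayes classifier with binary features and Laplace (add-one) smoothing; the tiny dataset is invented for illustration:

```python
import numpy as np

X = np.array([[1, 1, 0],
              [1, 0, 0],
              [0, 1, 1],
              [0, 0, 1]])           # binary feature vectors
y = np.array([0, 0, 1, 1])          # class labels

classes = np.unique(y)
log_prior = np.log(np.array([(y == c).mean() for c in classes]))
# P(feature_j = 1 | class c), with add-one smoothing
theta = np.array([(X[y == c].sum(axis=0) + 1.0) / ((y == c).sum() + 2.0)
                  for c in classes])

def predict(x):
    # log P(c) + sum_j log P(x_j | c), taking the argmax over classes
    log_lik = (x * np.log(theta) + (1 - x) * np.log(1 - theta)).sum(axis=1)
    return classes[np.argmax(log_prior + log_lik)]

print(predict(np.array([1, 1, 0])))  # -> 0
print(predict(np.array([0, 0, 1])))  # -> 1
```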
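For lecture 5, a minimal sketch of the conjugate Beta prior for Bernoulli trials: with a Beta(a, b) prior, observing h successes and t failures gives a Beta(a + h, b + t) posterior. The prior pseudo-counts and the data are assumed values:

```python
a, b = 2.0, 2.0          # prior pseudo-counts (an assumed choice)
h, t = 7, 3              # observed successes and failures

a_post, b_post = a + h, b + t
posterior_mean = a_post / (a_post + b_post)        # 9 / 14
map_estimate = (a_post - 1) / (a_post + b_post - 2)  # 8 / 12

print(round(posterior_mean, 3))  # 0.643
print(round(map_estimate, 3))    # 0.667
```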
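For lecture 8, a minimal sketch of random-walk Metropolis-Hastings sampling from an unnormalized one-dimensional density; the target density, proposal scale, and burn-in length are illustrative assumptions:

```python
import numpy as np

def unnormalized_target(x):
    # a mixture of two Gaussian bumps, known only up to a normalizing constant
    return np.exp(-0.5 * (x - 2) ** 2) + 0.5 * np.exp(-0.5 * (x + 2) ** 2)

rng = np.random.default_rng(0)
x = 0.0
samples = []
for step in range(20000):
    proposal = x + rng.normal(scale=1.0)          # symmetric random-walk proposal
    accept_prob = min(1.0, unnormalized_target(proposal) / unnormalized_target(x))
    if rng.random() < accept_prob:                # accept, or keep the old state
        x = proposal
    samples.append(x)

samples = np.array(samples[2000:])                # drop burn-in
print(samples.mean(), samples.std())
```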
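For lecture 9, a minimal sketch of EM for a two-component one-dimensional Gaussian mixture; the synthetic data and the initialization are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 0.5, 200)])

# initial guesses for the mixture weights, means, and variances
pi = np.array([0.5, 0.5])
mu = np.array([-1.0, 1.0])
var = np.array([1.0, 1.0])

def gauss(x, m, v):
    return np.exp(-0.5 * (x - m) ** 2 / v) / np.sqrt(2 * np.pi * v)

for iteration in range(100):
    # E-step: responsibilities gamma[i, k] = P(component k | data[i])
    dens = pi * gauss(data[:, None], mu, var)
    gamma = dens / dens.sum(axis=1, keepdims=True)
    # M-step: re-estimate the parameters from the weighted data
    n_k = gamma.sum(axis=0)
    pi = n_k / len(data)
    mu = (gamma * data[:, None]).sum(axis=0) / n_k
    var = (gamma * (data[:, None] - mu) ** 2).sum(axis=0) / n_k

print(np.round(mu, 2), np.round(var, 2), np.round(pi, 2))
```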
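For lecture 11, a minimal sketch of an Elo-style rating update, which follows from a Bradley-Terry / logistic model of game outcomes; the K-factor and the ratings below are assumed values:

```python
def expected_score(r_a, r_b):
    # probability that player A beats player B under the logistic model
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def elo_update(r_a, r_b, score_a, k=32.0):
    # score_a is 1 for a win, 0.5 for a draw, 0 for a loss
    e_a = expected_score(r_a, r_b)
    return (r_a + k * (score_a - e_a),
            r_b + k * ((1.0 - score_a) - (1.0 - e_a)))

print(elo_update(1600, 1500, 1.0))  # the winner gains what the loser drops
```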
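For lecture 13, a minimal sketch of value iteration on a tiny Markov decision process; the transition and reward tables are invented for illustration:

```python
import numpy as np

n_states, n_actions, gamma = 3, 2, 0.9
# P[s, a, s'] = transition probability, R[s, a] = expected immediate reward
P = np.array([[[0.8, 0.2, 0.0], [0.1, 0.9, 0.0]],
              [[0.0, 0.5, 0.5], [0.0, 0.1, 0.9]],
              [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]])
R = np.array([[0.0, 0.0],
              [0.0, 1.0],
              [0.0, 5.0]])

V = np.zeros(n_states)
for sweep in range(1000):
    # Bellman optimality backup: V(s) <- max_a [ R(s,a) + gamma * sum_s' P(s'|s,a) V(s') ]
    Q = R + gamma * (P @ V)
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:
        break
    V = V_new

policy = Q.argmax(axis=1)   # greedy policy with respect to the converged values
print(np.round(V, 2), policy)
```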