Automated Speech Recognition
Fall of 2008, SPSU IFMO.
An introductory course on speech recognition, covering DSP basics, MFCC features and feature selection, hidden Markov models, ANNs and TD-ANNs, and language modeling.
A sample TeX file for the lecture notes (the notes for lecture 2).
Another sample TeX file (sample from lecture 9, with figures and examples).
The course itself (all slides and lecture notes are in Russian):
- 1. Introduction. ASR challenges and course plan.
- 2. Signals and systems. Convolution. Integrators and differentiators. The Dirac delta function.
- Lecture notes by Oleg Dahin (.pdf, 165kb)
- 3. Eigenfunctions. Fourier series. Plancherel theorem. Periodic convolution. Discrete Fourier series. Fourier transform. Filters and filter banks.
- Lecture notes by Andrey Borisenko and Alexander Koshevoy (.pdf, 201kb)
- 4. Kotelnikov (Nyquist-Shannon) theorem. Examples of Fourier transforms. Fast Fourier transform. Filters and rational functions. FIR and IIR filters. Windows.
- 5. Speech signals. Spectrograms. Linear predictive coding. Autocorrelations. The Levinson-Durbin algorithm. Cepstrum. (The recursion is sketched in Python after the lecture list.)
- 6. Features of a speech signal. Filter banks, LPC, mel cepstrum. Discrete cosine transform. Mel scale. MFCC. (An MFCC sketch follows the lecture list.)
- Lecture notes by Eugene Selifonov and Andrey Tikhomirov (.pdf, 213kb)
- 7. Feature selection. Principal components analysis. Minimizing reconstruction error. (A PCA sketch follows the lecture list.)
- Lecture notes by Sergey Gindin and Andrey Davydov (.pdf, 202kb)
- 8. The kernel trick. Kernel PCA. Commonly used kernels. Kernel k-means clustering.
- Lecture notes by Catherine Vasilyeva and Egor Smirnov (.pdf, 179kb)
- 9. Hidden Markov models. Dynamic programming, the Viterbi algorithm. The Baum-Welch algorithm. Kullback-Leibler distance. (A Viterbi sketch follows the lecture list.)
- Slides (.pdf, 445kb)
- Lecture notes by Sergey Nikolenko (.pdf, 216kb)
- 10. Special cases of HMMs. Continuous distributions. Autoregressive HMMs. Optimization criteria: ML, MMI, MDI.
- Slides (.pdf, 478kb)
- 11. Artificial neural networks. Backpropagation. (A backpropagation sketch follows the lecture list.)
- Lecture notes by Daniel Penkin (.pdf, 192kb)
- 12. Time-delay ANNs (TD-ANN). Temporal backpropagation.
- 13. Language modeling. Grammars, context-free grammars, Chomsky normal form. Chart parser. Probabilistic context-free grammars.
- Lecture notes by Jan Malakhovski (.pdf, 210kb)
- 14. Language modeling. N-grams. Cross-entropy and perplexity. N-gram smoothing: backoff models, interpolation, Kneser-Ney smoothing. (A bigram perplexity sketch follows the lecture list.)
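A few illustrative code sketches follow. They are not part of the course materials: all function names, parameters, and toy data in them are invented for illustration, and they only assume Python with NumPy (and SciPy where noted). The first is a minimal Levinson-Durbin recursion (lecture 5): it solves the LPC normal equations directly from an autocorrelation sequence, checked here on a synthetic AR(2) signal.

    import numpy as np

    def levinson_durbin(r, order):
        # Solve the LPC normal equations sum_j a_j r[|i - j|] = r[i], i = 1..order,
        # for the predictor coefficients a_1..a_order, given autocorrelations r[0..order].
        a = np.zeros(order + 1)          # a[1..order]; a[0] is unused
        e = r[0]                         # prediction error energy
        for i in range(1, order + 1):
            k = (r[i] - np.dot(a[1:i], r[i - 1:0:-1])) / e   # reflection coefficient
            a_prev = a.copy()
            a[i] = k
            a[1:i] = a_prev[1:i] - k * a_prev[i - 1:0:-1]
            e *= 1.0 - k * k
        return a[1:], e

    # toy check: recover the coefficients of a synthetic AR(2) signal
    rng = np.random.default_rng(0)
    s = np.zeros(4000)
    for n in range(2, len(s)):
        s[n] = 1.3 * s[n - 1] - 0.4 * s[n - 2] + rng.normal()
    r = np.array([np.dot(s[:len(s) - lag], s[lag:]) for lag in range(3)]) / len(s)
    a, e = levinson_durbin(r, 2)
    print(a)    # roughly [1.3, -0.4]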
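A rough sketch of the MFCC pipeline (lecture 6), assuming a single pre-emphasized and windowed frame: power spectrum, triangular mel-scale filter bank, log energies, and a DCT that keeps the lowest cepstral coefficients. The frame length, filter count, and FFT size are arbitrary choices; SciPy is assumed for the DCT.

    import numpy as np
    from scipy.fft import dct

    def hz_to_mel(f):
        return 2595.0 * np.log10(1.0 + f / 700.0)

    def mel_to_hz(m):
        return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

    def mfcc_frame(frame, sample_rate, n_filters=26, n_ceps=13, n_fft=512):
        # power spectrum of one windowed frame
        spectrum = np.abs(np.fft.rfft(frame, n_fft)) ** 2
        # triangular filters spaced uniformly on the mel scale
        mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
        bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
        fbank = np.zeros((n_filters, n_fft // 2 + 1))
        for i in range(1, n_filters + 1):
            left, center, right = bins[i - 1], bins[i], bins[i + 1]
            fbank[i - 1, left:center] = (np.arange(left, center) - left) / max(center - left, 1)
            fbank[i - 1, center:right] = (right - np.arange(center, right)) / max(right - center, 1)
        log_energies = np.log(fbank @ spectrum + 1e-10)    # log mel filter bank energies
        return dct(log_energies, type=2, norm='ortho')[:n_ceps]

    rng = np.random.default_rng(0)
    frame = rng.normal(size=400) * np.hamming(400)         # a fake 25 ms frame at 16 kHz
    print(mfcc_frame(frame, 16000)[:4])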
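Principal components analysis (lecture 7), sketched as an eigendecomposition of the sample covariance matrix; the returned explained-variance fraction shows how much of the data the kept directions retain. The toy data is three-dimensional but essentially two-dimensional.

    import numpy as np

    def pca(X, k):
        # project the rows of X onto the top-k principal components
        Xc = X - X.mean(axis=0)                    # center the data
        cov = Xc.T @ Xc / (len(X) - 1)             # sample covariance matrix
        eigval, eigvec = np.linalg.eigh(cov)       # eigenvalues in ascending order
        top = np.argsort(eigval)[::-1][:k]         # indices of the k largest eigenvalues
        W = eigvec[:, top]
        explained = eigval[top].sum() / eigval.sum()
        return Xc @ W, W, explained

    rng = np.random.default_rng(0)
    Z = rng.normal(size=(500, 2))
    X = Z @ rng.normal(size=(2, 3)) + 0.05 * rng.normal(size=(500, 3))
    _, _, explained = pca(X, 2)
    print(explained)    # close to 1.0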
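The Viterbi algorithm (lecture 9) for a discrete-observation HMM, computed in the log domain to avoid underflow; the two-state, three-symbol model at the end is invented purely to exercise the code.

    import numpy as np

    def viterbi(obs, pi, A, B):
        # Most likely state sequence of a discrete HMM.
        # pi[i]: initial probabilities, A[i, j] = P(state j | state i),
        # B[i, k] = P(observation k | state i), obs: observation indices.
        logA, logB = np.log(A), np.log(B)
        T, N = len(obs), len(pi)
        delta = np.zeros((T, N))                  # best log-probability ending in each state
        psi = np.zeros((T, N), dtype=int)         # argmax predecessors for backtracking
        delta[0] = np.log(pi) + logB[:, obs[0]]
        for t in range(1, T):
            scores = delta[t - 1][:, None] + logA
            psi[t] = scores.argmax(axis=0)
            delta[t] = scores.max(axis=0) + logB[:, obs[t]]
        path = [int(delta[-1].argmax())]
        for t in range(T - 1, 0, -1):             # follow the back-pointers
            path.append(int(psi[t, path[-1]]))
        return path[::-1], delta[-1].max()

    pi = np.array([0.6, 0.4])
    A = np.array([[0.7, 0.3], [0.4, 0.6]])
    B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
    print(viterbi([0, 1, 2, 2], pi, A, B))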
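Backpropagation (lecture 11) on the smallest possible example: a one-hidden-layer sigmoid network trained on XOR with batch gradient descent and a squared-error loss. The learning rate, hidden layer size, and epoch count are arbitrary, and convergence depends on the random initialization.

    import numpy as np

    rng = np.random.default_rng(0)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([[0], [1], [1], [0]], dtype=float)        # XOR targets

    W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)          # input -> hidden
    W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)          # hidden -> output
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    lr = 1.0

    for epoch in range(10000):
        # forward pass
        h = sigmoid(X @ W1 + b1)
        out = sigmoid(h @ W2 + b2)
        # backward pass: chain rule for the squared-error loss
        d_out = (out - y) * out * (1 - out)
        d_h = (d_out @ W2.T) * h * (1 - h)
        W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(axis=0)

    print(out.round(2).ravel())    # usually close to [0, 1, 1, 0]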
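Finally, an n-gram example for lecture 14: an add-k smoothed bigram model with sentence boundary markers and a perplexity computation over held-out text. Add-k is the simplest of the smoothing schemes discussed in the lecture; backoff, interpolation, and Kneser-Ney refine the same probability estimates. The tiny corpus is made up.

    import math
    from collections import Counter

    def train_bigram(sentences, k=0.1):
        # add-k smoothed bigram probabilities with sentence boundary markers
        unigrams, bigrams, vocab = Counter(), Counter(), set()
        for s in sentences:
            toks = ['<s>'] + s.split() + ['</s>']
            vocab.update(toks)
            unigrams.update(toks[:-1])             # counts of bigram contexts
            bigrams.update(zip(toks, toks[1:]))
        V = len(vocab)
        return lambda prev, w: (bigrams[(prev, w)] + k) / (unigrams[prev] + k * V)

    def perplexity(prob, sentences):
        # 2 to the power of the average negative log2 probability per bigram
        log_sum, n = 0.0, 0
        for s in sentences:
            toks = ['<s>'] + s.split() + ['</s>']
            for prev, w in zip(toks, toks[1:]):
                log_sum += math.log2(prob(prev, w))
                n += 1
        return 2.0 ** (-log_sum / n)

    train = ['the cat sat', 'the dog sat', 'the cat ran']
    prob = train_bigram(train)
    print(perplexity(prob, ['the dog ran']))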
Selected references.
- Xuedong Huang, Alex Acero, Hsiao-Wuen Hon. Spoken Language Processing: A Guide to Theory, Algorithm, and System Development. Prentice Hall, 2001.