DS404
Probability and Statistics: Theory and Implementation

Faculty
Andrey Khokhlov
Chief Researcher, IEPT RAS
Overview
Several branches of statistical inference are now shared by mathematical statistics, data mining, and machine learning.
While statistics and probability are traditionally studied in mathematics departments, data mining and machine learning belong to computer science education. What can be lost in this divergence is the basic common knowledge that helps to avoid confusion and mistakes.
New contributions in data mining are mostly informal and usually linked with the Bayesian point of view. But probability theory itself is more than Kolmogorov's axiomatic approach: the frequentist approach of R. von Mises still coexists with newer contributions from quantum probability.
This course tries to show the existing diversity of approaches while staying within Kolmogorov's classical probability theory. The necessary classical theoretical material will be explained as well. We aim to consider several paradoxical situations that arise in practice.
Most examples come from the natural sciences and from simple situations in data analysis. The expected outcome is practical training in data processing together with the ability to read a professional text critically. A standard undergraduate course in calculus is required; some basic experience in MATLAB or Python programming would be appreciated. As an option, all the exercises and numerical tests can be solved using Octave, a free MATLAB-compatible alternative.
Learning highlights
- This course tries to show the existing diversity of approaches while staying within Kolmogorov's classical probability theory. The necessary classical theoretical material will be explained as well.
- First, we aim to improve practical skills in probabilistic methods of data analysis: the technical details are now well implemented in computational packages, but it is still easy to misunderstand the logic behind a method's design.
- Several similar vocabularies are used for significantly different approaches; these should also be studied to avoid misinterpretations and erroneous results. Training in reading modern and classical texts in Probability and Statistics is also an important goal of the course.
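As one illustration of how the logic of a method can be confused even when the arithmetic is trivial, the sketch below applies Bayes' rule to a diagnostic test. It is not taken from the course materials: the function name `posterior` and all the numbers (a 1% prior, 95% sensitivity, 5% false-positive rate) are assumptions chosen only to show how far P(condition | positive) can lie from P(positive | condition).

```python
from fractions import Fraction


def posterior(prior, sensitivity, false_positive_rate):
    """Return P(condition | positive test) computed by Bayes' rule."""
    true_pos = prior * sensitivity                   # P(condition and positive)
    false_pos = (1 - prior) * false_positive_rate    # P(no condition and positive)
    return true_pos / (true_pos + false_pos)


# Hypothetical numbers, chosen only for illustration.
p = posterior(Fraction(1, 100), Fraction(95, 100), Fraction(5, 100))

if __name__ == "__main__":
    print(p, float(p))
```

With these assumed inputs the posterior is far below the 95% sensitivity, which is the confusion of P(A | B) with P(B | A) in its simplest form. Exact fractions are used so that the result does not depend on floating-point rounding.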
Course outline
15 classes
Class 1
Classical finite models and the need for a rigorous theory. Paradoxes and Natural Sciences
Class 2
Combinatorics, cases of distinguishable and indistinguishable objects. Generating functions and other computational tools
Class 3
Conditional probabilities, the independence of events and its formal properties. Bayesian approach for finite and infinite discrete cases
Class 4
Heuristic non-finite models derived from symmetry arguments. Algebra of events and the corresponding mathematical theory. Probabilities in discrete sample spaces and the axiomatic approach
Class 5
Different approaches to probability: the von Mises model and the classical model. The general discrete model and the idea of a quantum probability model
Class 6
Random variables from formal and heuristic points of view. Examples. Scalar and vector random variables and their properties. Mutual and group independence
Class 7
Moments and other characteristics of random variables. Chebyshev inequalities. Are the moments always defined?
Class 8
The Law of Large Numbers and what Statistics can do. Sample space and several approaches in Statistics
Class 9
Integer-valued random variables and generating functions. Computational techniques
Class 10
Basic discrete models and discrete random variables. Sequences of random variables. Random Walks model
Class 11
Binomial distribution and its approximations in the limiting case. The idea of the Central Limit Theorem and its meaning for the Natural Sciences. An experimental illustration
Class 12
Distribution functions and the classification of random variables. Back to Statistics: the main problem of Classical statistics. Non-parametric criteria
Class 13
In depth: Lebesgue integration and computational formulas in general cases. Rademacher functions and the sequence of independent random variables. Kolmogorov’s axioms and other approaches
Class 14
In depth: Sequences of random variables and several types of their limiting behaviour. Characteristic functions and their properties
Class 15
Central Limit Theorem and its constraints. Gaussian and non-Gaussian statistics. Mixtures and randomisations. Real data and the theory
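The outline mentions the binomial distribution, its limiting approximations, and an experimental illustration of the Central Limit Theorem. A minimal sketch of such a numerical check might look as follows. It is not the course's own code: the helper names and the choice of a 0.5 continuity correction are assumptions, and only the Python standard library is used.

```python
import math


def binom_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def normal_cdf(x, mu, sigma):
    """CDF of the normal distribution N(mu, sigma^2), via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def normal_approx_cdf(k, n, p):
    """Normal approximation to P(X <= k) with a continuity correction of 0.5."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    return normal_cdf(k + 0.5, mu, sigma)


def max_abs_error(n, p):
    """Largest gap between the exact and approximate CDF over k = 0..n."""
    return max(
        abs(binom_cdf(k, n, p) - normal_approx_cdf(k, n, p)) for k in range(n + 1)
    )


if __name__ == "__main__":
    # Print the worst-case approximation error for a few sample sizes.
    for n in (10, 100, 400):
        print(n, max_abs_error(n, 0.3))
```

Running the script for increasing `n` gives a rough feel for how quickly the normal approximation improves, and changing `p` towards 0 or 1 shows where it struggles.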
Prerequisites
A standard calculus background is required: geometry, analysis of real-valued functions (including functions of several variables), real and complex linear algebra (matrices, eigenvalues, and eigenvectors), and at least basic knowledge of the Fourier transform. Test your skills on the standard prerequisite tasks at https://drive.google.com/open?id=1NN72jLCETTRLGznXyantZtRL-o7S3Tcl
Most undergraduate math programmes include elements of probability theory, so this course is not purely elementary or narrow in scope.
Basic programming experience is expected: whenever possible we use scripting languages and write simple programmes. Familiarity with a language such as Python or MATLAB is desirable but not critical.
Methodology
We stay within the so-called classical theory (the approach of A. N. Kolmogorov) but try to clarify a number of alternative lines of reasoning, such as Bayesian arguments and even quantum probabilistic models.
We build the existing theory from the ground up, alongside deconstructions of standard routine examples and unexpected counterexamples. Practical training means the design and implementation of computational algorithms.
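A classical counterexample of the kind alluded to above, checked by a small program, is that pairwise independence does not imply mutual independence. The sketch below is an illustration rather than course material: it enumerates two fair coin tosses, and the events `A`, `B`, `C` and the helper names are assumptions made for the example.

```python
from fractions import Fraction
from itertools import product

# Sample space: two tosses of a fair coin, all four outcomes equally likely.
outcomes = list(product("HT", repeat=2))


def prob(event):
    """Probability of an event given as a predicate on outcomes."""
    return Fraction(sum(1 for w in outcomes if event(w)), len(outcomes))


def both(x, y):
    """Intersection of two events."""
    return lambda w: x(w) and y(w)


def A(w):
    return w[0] == "H"      # first toss is heads


def B(w):
    return w[1] == "H"      # second toss is heads


def C(w):
    return w[0] != w[1]     # the two tosses differ


# Every pair of events satisfies P(X and Y) = P(X) * P(Y).
pairwise = all(
    prob(both(x, y)) == prob(x) * prob(y) for x, y in [(A, B), (A, C), (B, C)]
)

# The triple fails: A and B together force the tosses to be equal, so C cannot hold.
mutual = prob(both(both(A, B), C)) == prob(A) * prob(B) * prob(C)

if __name__ == "__main__":
    print("pairwise independent:", pairwise)
    print("mutually independent:", mutual)
```

Exhaustive enumeration with exact fractions keeps the check free of sampling noise, which matters when the point is a logical distinction rather than a numerical estimate.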
Grading
After getting his Ph.D. in Algebraic Topology in 1983, Andrey worked in several scientific and teaching organisations, among them the Russian Academy of Sciences, Moscow State University, and Baumann Technology University. Scientific advising of graduate and thesis students was part of his activities, not only in Russia but also in France.
Andrey’s main scientific results are linked with geophysical data processing, so his teaching interests are now naturally concentrated on applied methods of Statistics and their algorithmic implementations. He currently helps his students avoid common errors in probabilistic inference and supports their efforts to study Probability and Statistics theory in general.
Total hours
45 Hours
Dates
Nov 30 - Dec 18, 2020
Fee for single course
€1500
Fee for degree students
€750
How to secure your spot
Complete the form below to kickstart your application
Schedule your Harbour.Space interview
If successful, get ready to join us on campus
FAQ
Will I receive a certificate after completion?
Yes. Upon completion of the course, you will receive a certificate signed by the director of the program your course belonged to.
Do I need a visa?
This depends on your case. Please check with the Spanish or Thai consulate in your country of residence about visa requirements. We will do our part to provide you with the necessary documents, such as the Certificate of Enrollment.
Can I get a discount?
Yes. The easiest way to enroll in a course at a discounted price is to register for multiple courses. Registering for multiple courses will reduce the cost per individual course. Please ask the Admissions Office for more information about the other kinds of discounts we offer and what you can do to receive one.