The course offers an introduction to the most popular multivariate statistical techniques: principal components analysis, cluster analysis, discriminant analysis, and correspondence analysis.
Main textbook:
Härdle, W. and Simar, L. (2012). Applied Multivariate Statistical Analysis (3rd Edition). Springer-Verlag, Berlin Heidelberg.
Further reading:
Johnson, R. A. and Wichern, D. W. (2007). Applied Multivariate Statistical Analysis (6th Edition). Pearson Prentice Hall.
Zani, S. and Cerioli, A. (2007). Analisi dei dati e data mining per le decisioni aziendali. Giuffrè, Milano.
Learning Objectives
The course aims to provide knowledge and understanding of the theory and practice of some popular multivariate techniques: dimension reduction, cluster analysis, and discriminant analysis. On the practical side, the techniques covered will be implemented in the statistical software R.
Prerequisites
Preparatory courses:
STATISTICA I (Statistics I) and ALGEBRA LINEARE E GEOMETRIA ANALITICA (Linear Algebra and Analytic Geometry)
Teaching Methods
Lectures, exercises, and data lab sessions.
There will also be some homework assignments, whose solutions will be discussed in class.
Further information
Additional teaching materials will be provided during the course through the e-learning platform.
Type of Assessment
Written exam (including data analysis with R) and oral exam.
Course program
1 Introduction to multivariate statistical analysis
1.1 Graphical representations of multivariate data
1.2 Random vectors and summary statistics
2 Multivariate distributions
2.1 Distribution and density functions
2.2 Moments of multivariate distributions
2.3 Transformations
2.4 The Multinormal Distribution and its elementary properties
3 Decomposition of data matrices by factors
3.1 Projecting rows in subspaces
3.2 Projecting columns in subspaces
3.3 Relations between subspaces
4 Principal Components Analysis
4.1 Principal Components
4.2 Selecting the number of Principal Components
4.3 Interpretation of the results of Principal Components Analysis
5 Cluster Analysis
5.1 Proximity between objects
5.2 Clustering algorithms
6 Discriminant Analysis
6.1 Allocation rules for known distributions: Maximum Likelihood discriminant rule. Bayes discriminant rule. Minimization of the expected cost of misclassification.
6.2 Fisher Linear Discriminant Analysis
7 Simple Correspondence Analysis
7.1 Chi-Square decomposition and total inertia of a contingency matrix
7.2 Correspondence Analysis in practice