I have started these few threads to share what I have learnt from the MIT Data Science: Data To Insights course.

I highly recommend taking the course to learn more about the theoretical aspects of Data Science.

Syllabus:

**Week 1 – Module 1: Making sense of unstructured data**

Introduction

- What is unsupervised learning, and why is it challenging?
- Examples of unsupervised learning

Clustering

- What is clustering?
- When to use clustering
- K-means preliminaries
- The K-means algorithm
- How to evaluate clustering
- Beyond K-means: what really makes a cluster?
- Beyond K-means: other notions of distance
- Beyond K-means: data and pre-processing
- Beyond K-means: big data and nonparametric Bayes
- Beyond clustering
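To make the K-means lectures above concrete, here is a minimal pure-Python sketch of Lloyd's algorithm (my own illustration, not course code; the function name and defaults are mine):

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid-update steps."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid (squared Euclidean).
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            clusters[j].append(p)
        # Update step: move each non-empty centroid to the mean of its cluster.
        for j, cl in enumerate(clusters):
            if cl:
                centroids[j] = tuple(sum(c) / len(cl) for c in zip(*cl))
    return centroids, clusters
```

Note that the result depends on the random initialization, which is exactly why the course spends time on "how to evaluate clustering" and on what really makes a cluster.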

Spectral Clustering, Components and Embeddings

- What if we do not have features to describe the data, or not all are meaningful?
- Finding the principal components in data, and applications
- The magic of eigenvectors I
- Clustering in graphs and networks
- Features from graphs: the magic of eigenvectors II
- Spectral clustering
- Modularity Clustering
- Embeddings: new features and their meaning
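The "magic of eigenvectors" idea can be illustrated by extracting the top principal component of a dataset with power iteration on its covariance matrix (an illustrative sketch of my own, assuming non-degenerate data; not course code):

```python
def top_principal_component(data, iters=100):
    """Power iteration on the covariance matrix: converges to the leading eigenvector."""
    n, d = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in data]
    # Covariance matrix C[a][b] = (1/n) * sum_i x_ia * x_ib on centered data.
    cov = [[sum(centered[i][a] * centered[i][b] for i in range(n)) / n
            for b in range(d)] for a in range(d)]
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = sum(x * x for x in w) ** 0.5  # assumes data is not all identical
        v = [x / norm for x in w]
    return v
```

The same power-iteration trick reappears in the graph lectures, where eigenvectors of matrices built from a network yield features for spectral clustering.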

**Week 2 – Module 2: Regression and Prediction**

Classical Linear and Nonlinear Regression and Extensions

- Linear regression with one and several variables
- Linear regression for prediction
- Linear regression for causal inference
- Logistic and other types of nonlinear regression
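For the one-variable case, ordinary least squares has a closed form, which a short sketch makes explicit (my own illustration; the function name is mine):

```python
def fit_line(xs, ys):
    """Ordinary least squares for y ≈ a + b*x (one-variable linear regression)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Slope = covariance(x, y) / variance(x); intercept passes through the means.
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return a, b
```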

Modern Regression with High-Dimensional Data

- Making good predictions with high-dimensional data; avoiding overfitting by validation and cross-validation
- Regularization by Lasso, Ridge, and their modifications
- Regression Trees, Random Forest, Boosted Trees
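The shrinkage effect of Ridge regularization is easiest to see in a one-variable, no-intercept model, where the closed form is a one-liner (a toy sketch of my own, not the course's formulation):

```python
def ridge_1d(xs, ys, lam):
    """Ridge for y ≈ b*x (no intercept): the penalty lam * b**2 shrinks b toward 0."""
    # lam = 0 recovers ordinary least squares; larger lam means more shrinkage.
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)
```

In high dimensions this shrinkage is what keeps the fit from chasing noise, and validation or cross-validation is how you pick the penalty strength.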

The Use of Modern Regression for Causal Inference

- Randomized Control Trials
- Observational Studies with Confounding

**Week 3 – Module 3.1: Classification and Hypothesis Testing**

Hypothesis Testing and Classification

- What are anomalies? What is fraud? What is spam?
- Binary Classification: False Positive/Negative, Precision / Recall, F1-Score
- Logistic and Probit regression: statistical binary classification
- Hypothesis testing: Ratio Test and Neyman-Pearson
- p-values: confidence
- Support vector machine: non-statistical classifier
- Perceptron: simple classifier with elegant interpretation
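The false positive / false negative bookkeeping behind precision, recall, and F1 fits in a few lines (an illustrative sketch I wrote; labels are assumed to be 0/1):

```python
def binary_metrics(y_true, y_pred):
    """Precision, recall, and F1 from binary labels (1 = positive class)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0   # of flagged, how many real?
    recall = tp / (tp + fn) if tp + fn else 0.0      # of real, how many caught?
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)            # harmonic mean of the two
    return precision, recall, f1
```

For fraud and spam detection the trade-off between the two error types is the whole game, which is why the course pairs these metrics with hypothesis testing.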

**Week 4 – Module 3.2: Deep Learning**

Deep Learning

- What is image classification? Introduce ImageNet and show examples
- Classification using a single linear threshold (perceptron)
- Hierarchical representations
- Fitting parameters using back-propagation
- Non-convex functions
- How interpretable are its features?
- Manipulating deep nets (ostrich example)
- Transfer learning
- Other applications I: Speech recognition
- Other applications II: Natural language processing
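The "single linear threshold" classifier that opens this module can be sketched with the classic perceptron learning rule (my own toy version, assuming labels in {-1, +1} and linearly separable data):

```python
def train_perceptron(samples, epochs=20):
    """Perceptron rule: on each mistake, nudge the weights toward the example."""
    d = len(samples[0][0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        for x, y in samples:  # y in {-1, +1}
            # A mistake is a non-positive margin y * (w.x + b).
            if y * (sum(wi * xi for wi, xi in zip(w, x)) + b) <= 0:
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
    return w, b
```

Deep networks stack many such units into hierarchical representations and fit all the weights at once with back-propagation, rather than with this per-unit mistake-driven rule.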

**Week 5 – Module 4: Recommendation Systems**

Recommendations and ranking

- What does a recommendation system do?
- What is the recommendation prediction problem, and what data do we have?
- Using population averages
- Using population comparisons and ranking

Collaborative filtering

- Personalization via collaborative filtering based on similar users
- Personalization via collaborative filtering based on similar items
- Personalization via collaborative filtering based on similar users and items
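The user-based variant can be sketched as a similarity-weighted average over other users who rated the item (an illustrative sketch of my own; the data layout, a dict of user -> {item: rating}, is my assumption):

```python
def predict_rating(ratings, user, item):
    """User-based collaborative filtering with cosine similarity over shared items."""
    def cos(u, v):
        common = set(u) & set(v)
        num = sum(u[i] * v[i] for i in common)
        du = sum(x * x for x in u.values()) ** 0.5
        dv = sum(x * x for x in v.values()) ** 0.5
        return num / (du * dv) if du and dv else 0.0

    num = den = 0.0
    for other, r in ratings.items():
        if other != user and item in r:
            s = cos(ratings[user], r)  # weight each neighbour by similarity
            num += s * r[item]
            den += abs(s)
    return num / den if den else 0.0
```

The item-based variant is the mirror image: compare items by the users who rated them, then average over similar items the target user has already rated.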

Personalized Recommendations

- Personalization using comparisons, rankings and users-items
- Hidden Markov Model / Neural Nets, Bipartite graph and graphical model
- Using side-information
- 20 questions and active learning
- Building a system: algorithmic and system challenges

Wrap-up

- Guidelines on building a system
- Parting remarks and challenges

**Week 6 – Module 5: Networks and Graphical Models**

Introduction

- Introduction to networks
- Examples of networks
- Representation of networks

Networks

- Centrality measures: degree, eigenvector, and page-rank
- Closeness and betweenness centrality
- Degree distribution, clustering, and small world
- Network models: Erdős-Rényi, configuration model, preferential attachment
- Stochastic models on networks for spread of viruses or ideas
- Influence maximization
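Of the centrality measures listed, PageRank is the least obvious, so here is a minimal power-iteration sketch (my own illustration on a dict of node -> out-links; not course code):

```python
def pagerank(links, d=0.85, iters=50):
    """Power iteration for PageRank with damping factor d."""
    nodes = list(links)
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iters):
        # Every node gets the baseline (1 - d) / n "teleport" mass.
        new = {u: (1 - d) / n for u in nodes}
        for u in nodes:
            out = links[u]
            if out:
                share = d * rank[u] / len(out)
                for v in out:
                    new[v] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                for v in nodes:
                    new[v] += d * rank[u] / n
        rank = new
    return rank
```

This is again an eigenvector computation in disguise: the ranks converge to the leading eigenvector of the damped transition matrix.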

Graphical models

- Undirected graphical models
- Ising and Gaussian models
- Learning graphical models from data
- Directed graphical models
- V-structures, “explaining away”, and learning directed graphical models
- Inference in graphical models: marginals and message passing
- Hidden Markov Model (HMM)
- Kalman filter
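Inference in an HMM, one of the message-passing examples above, can be sketched with the forward algorithm, which sums over all hidden-state paths in linear time (an illustrative sketch of my own; the dict-based parameter layout is my assumption):

```python
def forward(obs, states, start, trans, emit):
    """HMM forward algorithm: total probability of an observation sequence."""
    # alpha[s] = P(observations so far, current hidden state = s)
    alpha = {s: start[s] * emit[s][obs[0]] for s in states}
    for o in obs[1:]:
        alpha = {s: emit[s][o] * sum(alpha[r] * trans[r][s] for r in states)
                 for s in states}
    return sum(alpha.values())
```

The Kalman filter in the last bullet is the continuous-state analogue: the same recursive message passing, but with Gaussians instead of discrete tables.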