Introduction To MIT Data Science: Data To Insights

I have started a few threads about what I have learnt from the MIT Data Science: Data To Insights course.

I highly recommend that you take up the course to learn more about the theoretical aspects of Data Science.

Week 1 – Module 1: Making sense of unstructured data


  1. What is unsupervised learning, and why is it challenging?
  2. Examples of unsupervised learning


  1. What is clustering?
  2. When to use clustering
  3. K-means preliminaries
  4. The K-means algorithm
  5. How to evaluate clustering
  6. Beyond K-means: what really makes a cluster?
  7. Beyond K-means: other notions of distance
  8. Beyond K-means: data and pre-processing
  9. Beyond K-means: big data and nonparametric Bayes
  10. Beyond clustering

Spectral Clustering, Components and Embeddings

  1. What if we do not have features to describe the data, or not all are meaningful?
  2. Finding the principal components in data, and applications
  3. The magic of eigenvectors I
  4. Clustering in graphs and networks
  5. Features from graphs: the magic of eigenvectors II
  6. Spectral clustering
  7. Modularity Clustering
  8. Embeddings: new features and their meaning

Week 2 – Module 2: Regression and Prediction

Classical Linear and Nonlinear Regression and Extensions

  1. Linear regression with one and several variables
  2. Linear regression for prediction
  3. Linear regression for causal inference
  4. Logistic and other types of nonlinear regression

Modern Regression with High-Dimensional Data

  1. Making good predictions with high-dimensional data; avoiding overfitting by validation and cross-validation
  2. Regularization by Lasso, Ridge, and their modifications
  3. Regression Trees, Random Forest, Boosted Trees

The Use of Modern Regression for Causal Inference

  1. Randomized Control Trials
  2. Observational Studies with Confounding

Week 3 – Module 3.1: Classification and Hypothesis Testing

Hypothesis Testing and Classification

  1. What are anomalies? What is fraud? What is spam?
  2. Binary Classification: False Positive/Negative, Precision / Recall, F1-Score
  3. Logistic and Probit regression: statistical binary classification
  4. Hypothesis testing: Ratio Test and Neyman-Pearson
  5. p-values: confidence
  6. Support vector machine: non-statistical classifier
  7. Perceptron: simple classifier with elegant interpretation

Week 4 – Module 3.2: Deep Learning

Deep Learning

  1. What is image classification? Introduce ImageNet and show examples
  2. Classification using a single linear threshold (perceptron)
  3. Hierarchical representations
  4. Fitting parameters using back-propagation
  5. Non-convex functions
  6. How interpretable are its features?
  7. Manipulating deep nets (ostrich example)
  8. Transfer learning
  9. Other applications I: Speech recognition
  10. Other applications II: Natural language processing

Week 5 – Module 4: Recommendation Systems

Recommendations and ranking

  1. What does a recommendation system do?
  2. What is the recommendation prediction problem, and what data do we have?
  3. Using population averages
  4. Using population comparisons and ranking

Collaborative filtering

  1. Personalization with collaborative filtering based on similar users
  2. Personalization with collaborative filtering based on similar items
  3. Personalization with collaborative filtering based on similar users and items

Personalized Recommendations

  1. Personalization using comparisons, rankings and users-items
  2. Hidden Markov Model / Neural Nets, Bipartite graph and graphical model
  3. Using side-information
  4. 20 questions and active learning
  5. Building a system: algorithmic and system challenges


  1. Guidelines on building a system
  2. Parting remarks and challenges

Week 6 – Module 5: Networks and Graphical Models


  1. Introduction to networks
  2. Examples of networks
  3. Representation of networks


  1. Centrality measures: degree, eigenvector, and PageRank
  2. Closeness and betweenness centrality
  3. Degree distribution, clustering, and small world
  4. Network models: Erdős-Rényi, configuration model, preferential attachment
  5. Stochastic models on networks for spread of viruses or ideas
  6. Influence maximization

Graphical models

  1. Undirected graphical models
  2. Ising and Gaussian models
  3. Learning graphical models from data
  4. Directed graphical models
  5. V-structures, “explaining away”, and learning directed graphical models
  6. Inference in graphical models: marginals and message passing
  7. Hidden Markov Model (HMM)
  8. Kalman filter