The only thing I want to keep after two years in Data Science

Patrick Ho
Nov 20, 2018

Over two years of pursuing a degree in Data Science, I have gathered quite a lot of cheat sheets for myself, from mathematics to data cleansing to guides on package usage in Python, R and SQL. Though there are plenty of resources on machine learning, it is difficult to find a good set that captures and summarizes all the important concepts well.

Fortunately, Shervine Amidi, a graduate student at Stanford, and Afshine Amidi, of MIT and Uber, have created such a set of resources. Their VIP cheat sheets are publicly shared in their GitHub repo and cover the key top-level topics in Stanford's CS 229 Machine Learning course. I found this resource super useful when revising for my final exams, since it also captures all the essential concepts that I studied in Data Analysis Algorithms and Applied Data Analysis at Monash University.

The content of these cheat sheets consists of:

- Supervised learning:

Linear models: Linear regression, Classification and Logistic regression, and Generalized linear models.
Support vector machines.
Generative learning: Gaussian discriminant analysis, and Naive Bayes.
Tree-based and ensemble methods.

- Unsupervised learning:

Clustering: Expectation maximization, K-means clustering, and Hierarchical clustering.
Dimension reduction: Principal component analysis and Independent component analysis.

- Deep learning: