
The only thing I want to keep after two years in Data Science

Over two years of pursuing a degree in Data Science, I have gathered quite a lot of cheat sheets for myself, ranging from mathematics and data cleansing to guides on package usage in Python, R, and SQL. Though there are plenty of resources on machine learning, it is difficult to find a good set that captures and summarizes all the important concepts.

Fortunately, Shervine Amidi, a graduate student at Stanford, and Afshine Amidi, of MIT and Uber, have created such a set of resources. The VIP cheat sheets are publicly shared on their GitHub repo and cover the key top-level topics in Stanford's CS 229 Machine Learning course. I found this resource very useful when revising for my final exams, since it also captures all the essential concepts that I studied in Data Analysis Algorithms and Applied Data Analysis at Monash University.

These cheat sheets cover:

- Supervised learning:

  • Linear models: Linear regression, Classification and logistic regression, and Generalized linear models.

  • Support vector machines.

  • Generative learning: Gaussian discriminant analysis, and Naive Bayes.

  • Tree-based and ensemble methods.

- Unsupervised learning:

  • Clustering: Expectation maximization, K-means clustering, and Hierarchical clustering.

  • Dimension reduction: Principal component analysis and Independent component analysis.

- Deep learning:

  • Neural networks

  • Convolutional neural networks

  • Recurrent neural networks

  • Reinforcement learning and control

- Machine learning tips and tricks

- Probability basics, Linear algebra, and Calculus
