The only thing I want to keep after two years in Data Science
Over two years of pursuing a degree in Data Science, I have gathered quite a lot of cheat sheets for myself, covering everything from mathematics to data cleansing to package usage guides for Python, R, and SQL. Though there are plenty of resources on machine learning, it is difficult to find a good set that captures and summarizes all the important concepts.
Fortunately, Shervine Amidi, a graduate student at Stanford, and Afshine Amidi, of MIT and Uber, have created such a set of resources. Their VIP cheat sheets are publicly shared on their GitHub repo and cover the key top-level topics in Stanford's CS 229 Machine Learning course. I found this resource super useful when revising for my final exams, since it also covers all the essential concepts that I studied in Data Analysis Algorithms and Applied Data Analysis at Monash University.
These cheat sheets cover:
- Supervised learning:
  - Linear models: linear regression, classification and logistic regression, and generalized linear models
  - Support vector machines
  - Generative learning: Gaussian discriminant analysis and Naive Bayes
  - Tree-based and ensemble methods
- Unsupervised learning:
  - Clustering: expectation maximization, K-means clustering, and hierarchical clustering
  - Dimension reduction: principal component analysis and independent component analysis
- Deep learning:
  - Neural networks
  - Convolutional neural networks
  - Recurrent neural networks
  - Reinforcement learning and control
- Machine learning tips and tricks
- Probability basics, linear algebra, and calculus