This is a blueprint for data science, from the mathematics to the algorithms. It is not complete. It reflects my own interests and is based on web sources, and some material is drawn from the original authors' blogs. Last but not least, thanks to the teachers who guided me to mathematics. The use of the pictures has not been permitted by their original owners. If this material infringes your copyright, please contact me via GitHub and I will delete the pictures.
I hope that it follows the ADEPT method described by Kalid Azad.
ADEPT Method for Learning | |
---|---|
Analogy | Tell me what it is like |
Diagram | Help me visualize it |
Example | Allow me to experience it |
Plain English | Describe it with everyday words |
Technical Definition | Discuss the formal details |
It is expected to be a part of A Guide to Data Science, which presents a rough overview of data science and is meant to attract people to the world of data.
It includes the architecture, optimization methods, regularization, acceleration and compression of deep neural networks. The state of the art is not discussed.
It includes, but is not limited to, the following content:
- basic introduction to probability and statistics;
- sampling algorithms based on uniformly distributed data;
- MCMC and stochastic methods;
- generalized linear model and regression analysis;
- basic machine learning;
- numerical optimization methods;
- some materials on artificial neural networks and deep learning;
- probabilistic programming and graph models;
- other data analysis such as topological data analysis;
- applications or models in practice such as recommender systems, information retrieval, computer vision;
- other computational intelligence such as simulated annealing.
As the name suggests, it is a guide, so it does not cover all the methods or techniques in data science or data mining. Deeper, more advanced topics such as probabilistically correct algorithms are not discussed. Until now, it has been driven by web resources. There are many links on each topic but no concrete examples or code.
The basic idea is machine learning = representation + evaluation + optimization. I would like to formulate every machine learning problem as a numerical optimization problem. The section Numerical optimization may be too theoretical for practitioners and too simple for researchers in optimization.
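To make the decomposition concrete, here is a minimal sketch in plain Python. It is only an illustration under assumptions of my own: the toy data, the linear model, the mean squared error, and the names `predict`, `mse` and `fit` are all hypothetical choices, not something taken from the rest of these notes.

```python
# Representation: a linear model y = w * x + b (an assumed toy choice).
def predict(w, b, x):
    return w * x + b


# Evaluation: mean squared error of the model on the data.
def mse(w, b, xs, ys):
    return sum((predict(w, b, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Optimization: plain gradient descent on the mean squared error.
def fit(xs, ys, lr=0.05, steps=2000):
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Partial derivatives of the MSE with respect to w and b.
        grad_w = sum(2 * (predict(w, b, x) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (predict(w, b, x) - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated by y = 2x + 1 (hypothetical, noise-free).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = fit(xs, ys)
print(round(w, 3), round(b, 3), mse(w, b, xs, ys))
```

On this noise-free data the fitted `w` and `b` should come out close to 2 and 1. Swapping any one of the three pieces (a different model, a different loss, a different optimizer) gives a different learning method, which is the sense in which the problem becomes one of numerical optimization.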
These are drafts and notes from when I was learning data science. It is planned to become an open-source e-book.