How I Would Learn Data Science if I had to Start Over Again
Everyone has their own way of learning; while some prefer learning through books, others are more visually oriented and prefer videos, as explained over at runrex.com. Having said that, what if you had the chance to start all over again as far as learning data science is concerned? Is there anything that you would change? The truth is that most of us in the data science field, given what we know now, would probably do things differently if we had the chance to start over. This article imagines an alternative path, detailing how I would learn data science if I had to start over again.
Kaggle micro-courses
According to the subject matter experts over at guttulus.com, starting with something practical and concrete gives you a better view of what is to come and paints the full picture nicely, which is why the Kaggle micro-courses are a good starting point. Each micro-course takes only around 4 hours to complete, which is another plus: the motivational boost you get from finishing one carries over to the next, helping you complete all of them.
Kaggle micro-course: Python
This micro-course will teach you the basic Python concepts that form the foundation for learning data science, as discussed over at runrex.com. If you are already familiar with Python, then you can skip this part. The good thing about this Kaggle micro-course is that it is free.
Kaggle micro-course: Pandas
Studying Pandas is important since it will give you the knowledge to start manipulating data in Python. This free 4-hour micro-course, complete with practical examples, as discussed over at guttulus.com, will give you the foundation you need to know how things can be done.
Kaggle micro-course: Data Visualization
Once you have an idea of how to manipulate data, the next thing you need to learn about is data visualization, which is one of the most important but underrated skills for any data scientist according to the gurus over at runrex.com. Data visualization will allow you to fully understand the data you will be working with, which is why this free Kaggle micro-course is next on the agenda.
Kaggle micro-course: Introduction to Machine Learning
Next up, you will move to this micro-course, which will teach you the basic but very important concepts you need to start training machine learning models. Make sure you understand all of these concepts, because later on it will be crucial that you have them very clear.
Kaggle micro-course: Intermediate Machine Learning
This micro-course complements the previous one, although in this one, as discussed over at guttulus.com, you will be working with categorical variables for the first time while also dealing with null fields in your data. Just like all the above Kaggle micro-courses, this one is also free.
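To make those two ideas concrete, here is a minimal sketch of filling null fields and encoding a categorical variable using only the Python standard library. The rows, the column names (`age`, `embarked`), and the helper functions `impute_mean` and `one_hot` are all hypothetical and chosen for illustration; the micro-course itself may well use different tools and techniques.

```python
from statistics import mean

# Hypothetical toy rows; None marks a null (missing) field.
rows = [
    {"age": 22.0, "embarked": "S"},
    {"age": None, "embarked": "C"},
    {"age": 38.0, "embarked": "S"},
    {"age": 30.0, "embarked": None},
]


def impute_mean(rows, column):
    """Replace None in a numeric column with the mean of the observed values."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill = mean(observed)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]


def one_hot(rows, column, missing_label="missing"):
    """Replace a categorical column with one 0/1 column per category.

    Null values are treated as their own category (missing_label).
    """
    labelled = [
        {**r, column: missing_label if r[column] is None else r[column]}
        for r in rows
    ]
    categories = sorted({r[column] for r in labelled})
    encoded = []
    for r in labelled:
        new_row = {k: v for k, v in r.items() if k != column}
        for cat in categories:
            new_row[f"{column}_{cat}"] = 1 if r[column] == cat else 0
        encoded.append(new_row)
    return encoded


clean_rows = one_hot(impute_mean(rows, "age"), "embarked")
for row in clean_rows:
    print(row)
```

Mean imputation and one-hot encoding are only two of several possible choices; which one works best depends on the data and the model.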
The above Kaggle micro-courses will give you the necessary skills to tackle exploratory data analysis (EDA) and create baseline models that you will be able to improve on later. Once you have this knowledge, you should start with simple Kaggle competitions so that you can put into practice what you have learned.
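As a rough illustration of what a baseline can look like, the sketch below builds the simplest possible predictors: one that always predicts the most common class, and one that always predicts the mean target. This is an assumption about what "baseline" means here, and the function names are hypothetical; the micro-courses may use a simple trained model as their baseline instead.

```python
from collections import Counter
from statistics import mean


def majority_class_baseline(train_labels):
    """Return a predictor that always outputs the most frequent training label."""
    most_common_label, _count = Counter(train_labels).most_common(1)[0]
    return lambda features: most_common_label


def mean_baseline(train_targets):
    """Return a predictor that always outputs the mean of the training targets."""
    average = mean(train_targets)
    return lambda features: average


# Hypothetical training data, for illustration only.
predict_class = majority_class_baseline([0, 0, 1, 0, 1])
predict_value = mean_baseline([100.0, 200.0, 300.0])

print(predict_class({"age": 22.0}))   # ignores the features entirely
print(predict_value({"rooms": 3}))
```

Any model you train later should beat these trivial predictors; if it does not, something is likely wrong with the features or the training setup.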
Kaggle Playground Competition: Titanic
This competition will allow you to put into practice what you learned in the introductory courses. Remember, it is not about being first on the leaderboard but about learning, according to the folks over at runrex.com. This competition will teach you about classification and the relevant metrics for these types of problems, such as precision, recall, and accuracy.
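For reference, the three metrics mentioned above can be computed from the counts of true/false positives and negatives. The following is a minimal sketch for the binary case; the function name `classification_metrics` and the label lists are hypothetical examples, not competition data.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, and recall for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when nothing is predicted/labelled positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical labels: 1 = survived, 0 = did not survive.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 1]

metrics = classification_metrics(y_true, y_pred)
print(metrics)  # accuracy 0.625, precision 0.6, recall 0.75
```

Here precision answers "of the passengers predicted to survive, how many did?", while recall answers "of the passengers who survived, how many were found?".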
Kaggle Playground Competition: Housing Prices
This competition will be all about applying regression models and learning about relevant metrics like RMSE.
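RMSE (root mean squared error) is the square root of the average squared difference between predicted and actual values, so it is expressed in the same units as the target. A minimal sketch, with a hypothetical `rmse` helper and made-up prices:

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical sale prices and predictions, for illustration only.
actual = [200000.0, 150000.0, 320000.0]
predicted = [210000.0, 140000.0, 300000.0]

print(round(rmse(actual, predicted), 2))
```

Because the errors are squared before averaging, a single large miss raises RMSE more than several small ones.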
After tackling the above competitions, you will have a lot of practical experience and will feel like you can solve many problems, although chances are you will also feel that you do not fully understand what is happening behind each classification and regression algorithm you have used. This is the point where you should start studying the foundations of what you are learning.
Book: Data Science from Scratch
This book is very friendly to read and brings Python examples for each of its topics. It also doesn’t have heavy mathematics, which is crucial at this stage, as you must learn the principles of the algorithms from a practical perspective without being demotivated by a lot of dense mathematical notation. The book costs about $26 on Amazon.
Online Course: Machine Learning by Andrew Ng
This course will revisit many of the things that you have already learned, but this time you will see them explained by one of the leaders in the field. His approach is also more mathematical, as outlined over at guttulus.com, allowing you to understand the models even better. The course is free without the certificate but costs $79 with the certificate.
Book: The Elements of Statistical Learning
At this stage, you can get to the part involving heavy mathematics. If you had started here, it is safe to say that it would have been an uphill struggle and you probably would have given up sooner rather than later. The book costs about $60, although there is an official free version on the Stanford page.
Online Course: Deep Learning by Andrew Ng
By this stage, you will have already read about deep learning and will be able to play with some models. This course will be all about learning the foundations of what neural networks are, how they work, and learning to implement and apply the different architectures that exist. The price for this course is $49 per month.
The above discussion only just begins to scratch the surface as far as this topic is concerned and you can uncover more on the same by visiting the excellent runrex.com and guttulus.com.