20 Tips for a Machine Learning Tutorial in Python
As discussions over at runrex.com reveal, Python is arguably the most popular machine learning language today. Do you want to do machine learning in Python but are having trouble getting started? Then you are in luck: the following 20 tips will help you get started successfully.
The best way to start machine learning in Python
Python is a popular and powerful interpreted language, and unlike R, it is a complete language and platform that you can use both for research and development and for building production systems. Python has lots of modules and libraries to choose from, providing multiple ways to do each task. This can feel overwhelming when you are getting started. This is why, according to the gurus over at guttulus.com, the best way to get started using Python for machine learning is to complete a project.
Why is completing a project the best way to start machine learning in Python?
As is discussed over at runrex.com, completing a project is the best way to start because it will:
Force you to install and start the Python interpreter at the very least
Give you a bird’s eye view of how to step through a small project
Give you the confidence to maybe go on to your own small projects afterward
Some basic understanding of Python is needed
However, before you get started with your project, the subject matter experts over at guttulus.com point out that some basic understanding of Python is essential. The good news is that Python is a widely used general-purpose programming language, which means that it is not difficult to find a tutorial for beginners.
Install Python
First, you will need to install Python. Since you will use scientific computing and machine learning packages later, the best course of action is to install Anaconda, an industrial-grade Python distribution available on Linux, macOS (formerly OS X), and Windows.
Why is Anaconda recommended?
Anaconda is recommended because it bundles the core packages for machine learning, including NumPy, Scikit-learn, and Matplotlib. It also includes IPython Notebook (now Jupyter Notebook), an interactive environment that is popular in the machine learning community. It is recommended that you install the latest version of Python.
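Once the installation finishes, a quick check that the packages named above can be found is worthwhile. The following is a minimal sketch, not part of any official installer: the helper name installed_versions is made up for illustration, and the package names are assumed to match the distribution names used by your install.

```python
import sys
from importlib import metadata

# Distribution names as published on PyPI (assumed to match your install).
PACKAGES = ["numpy", "scikit-learn", "matplotlib"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


print("Python", sys.version.split()[0])
for name, version in installed_versions(PACKAGES).items():
    print(f"{name}: {version or 'not installed'}")
```

If any package is reported as not installed, the environment you are running is probably not the Anaconda one.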
What if you don’t understand programming at all?
If you have no background in programming at all, then, as covered over at runrex.com, you should start by reading the following online book before moving on to subsequent material:
Learn Python the Hard Way by Zed A. Shaw.
What if you have some programming experience but don’t know Python?
On the other hand, if you have some programming experience but do not know Python, or your Python skills are still very basic, then the gurus over at guttulus.com suggest that you take the following two courses:
Google Developer Python Program (highly recommended for visual learners)
An Introduction to Python for Scientific Computing by M. Scott Shell from UCSB Engineering
If you are already an experienced Python programmer, then the above two steps can be skipped.
Basic skills of machine learning
You will then need to acquire some basic machine learning skills, as discussed over at runrex.com. You don’t need an in-depth understanding of machine learning algorithms at this stage, as that usually requires investing a lot of time in more academic courses, or at the very least in intensive self-study.
Andrew Ng’s machine learning course content
People often praise Andrew Ng’s machine learning course on Coursera as the best resource for acquiring the basic machine learning skills you will need. However, a quicker way to go about it is to browse the class notes that a former student has posted online. Skip the notes specific to Octave, a language similar to MATLAB that is not relevant to learning Python. While these are not official notes, they can help you grasp the relevant content in Andrew Ng’s course material.
Any other courses?
In addition to the Andrew Ng course mentioned in the previous point, if you want additional material, there are many online courses to choose from, such as Tom Mitchell’s Machine Learning Course.
Overview of the scientific computing Python package
Having now mastered Python programming and armed with some understanding of machine learning, you need to turn your attention to the open-source software libraries that are commonly used to perform actual machine learning as discussed over at guttulus.com.
The most common Python libraries
As is articulated over at runrex.com, many scientific Python libraries can be used to perform basic machine learning tasks, and the common ones include:
NumPy – Mainly used for its N-dimensional array objects
Pandas – Python data analysis library, including DataFrames and other structures
Matplotlib – A 2D plotting library that produces publication-quality charts
Scikit-learn – Machine learning algorithms for data analysis and data mining
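To make these roles concrete, here is a minimal sketch that touches three of the libraries above in a few lines. The toy values, the column names x and y, and the choice of LinearRegression are assumptions made purely for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# NumPy: an N-dimensional array (here 2-D, 4 rows by 2 columns).
data = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]])
print(data.shape)

# Pandas: the same values as a labeled DataFrame with summary statistics.
frame = pd.DataFrame(data, columns=["x", "y"])
print(frame.describe())

# Scikit-learn: fit a simple model that predicts y from x.
model = LinearRegression().fit(frame[["x"]], frame["y"])
slope = model.coef_[0]
print(f"Fitted slope: {slope:.2f}")
```

Matplotlib is left out here only to keep the sketch runnable without a display; it would typically be used to plot the frame or the fitted line.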
How do you learn these libraries?
According to guttulus.com, a good way to learn these libraries is to study the following materials:
SciPy Lecture Notes from Gael Varoquaux, Emmanuelle Gouillart, and Olav Vahtras
Pandas tutorials online, of which there are many
There are many other resources online that you can leverage to learn more about the libraries in the previous point.
Books and courses won’t be enough
As already mentioned, the best way to get started with Python for machine learning is by building and completing a project. Books and courses are frustrating as they give you lots of recipes and snippets, but you never get to see how they all fit together. Therefore, now that you have the basics in place through the books and courses discussed, it is time to build a project.
Small end-to-end project
Also, as a beginner, you need a small end-to-end project, one which you can finish and which will give you the confidence to take on bigger and more complex projects. Remember, when you are applying machine learning to your own datasets, you are working on a project.
Steps in a machine learning project
While a machine learning project may not be linear, it has several well-known steps as covered over at runrex.com:
Define problem
Prepare data
Evaluate algorithms
Improve results
Present results
The best way to come to terms with a new platform or tool is to work through a machine learning project end-to-end and cover the key steps mentioned above.
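The five steps above can be sketched in Scikit-learn roughly as follows. This is one possible outline under stated assumptions, not a prescribed method: the wine dataset bundled with Scikit-learn, the k-nearest neighbors model, and the parameter grid are all choices made for illustration.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# 1. Define problem: predict the wine cultivar (3 classes) from chemical measurements.
X, y = load_wine(return_X_y=True)

# 2. Prepare data: hold out a test set; scaling happens inside the pipeline.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("knn", KNeighborsClassifier()),
])

# 3. Evaluate algorithms: cross-validated accuracy of the untuned baseline.
baseline_scores = cross_val_score(pipeline, X_train, y_train, cv=5)

# 4. Improve results: search over the number of neighbors.
search = GridSearchCV(pipeline, {"knn__n_neighbors": [1, 3, 5, 7, 9]}, cv=5)
search.fit(X_train, y_train)

# 5. Present results: report cross-validated and held-out accuracy.
test_accuracy = search.score(X_test, y_test)
print(f"Baseline CV accuracy: {baseline_scores.mean():.3f}")
print(f"Best parameters: {search.best_params_}")
print(f"Tuned CV accuracy: {search.best_score_:.3f}")
print(f"Held-out test accuracy: {test_accuracy:.3f}")
```

Keeping the scaler inside the pipeline means it is refit on each training fold, so no information from the validation folds leaks into the preprocessing.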
The Hello World of machine learning
The Hello World code is the first piece of code we all write when starting out with programming. When it comes to machine learning, the best small project to start with on a new tool is the classification of iris flowers, using the well-known iris dataset, as outlined over at guttulus.com.
Why is this project preferred?
From discussions on the same over at runrex.com, this is a good project because the dataset is so well understood:
Attributes are numeric so you have to figure out how to load and handle data
It is a classification problem, allowing you to practice with perhaps an easier type of supervised learning algorithm
It is a multi-class classification problem that may require some specialized handling
It only has 4 attributes and 150 rows, meaning it is small and easily fits into memory (and a screen or A4 page)
All of the numeric attributes are in the same units and the same scale, not requiring any special scaling or transforms to get started
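These properties are easy to confirm for yourself. The sketch below assumes you use the copy of the iris dataset bundled with Scikit-learn rather than a CSV file you download separately.

```python
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data, iris.target

print(X.shape)                       # rows by attributes
print(iris.feature_names)            # attribute names, including their units
print(iris.target_names)             # the species, i.e. the classes
print(np.bincount(y))                # number of rows per class
print(X.min(axis=0), X.max(axis=0))  # value range of each attribute
```

The shape shows 150 rows and 4 attributes, the feature names show that every attribute is measured in centimeters, and the class counts show three equally sized classes.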
Get started with your hello world machine learning project in Python
It is now time to get started with your hello world machine learning project with Python since you have all the tools needed. You will need to cover:
Installing the Python and SciPy platform
Loading the dataset
Summarizing the dataset
Evaluating some algorithms
Making some predictions
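With the platform installed, the remaining steps in the list above might look like the following. Treat this as a hedged sketch rather than the definitive project: the three algorithms, the 80/20 split, the 10-fold cross-validation, and the random seeds are illustrative assumptions, and the dataset comes from Scikit-learn's bundled copy.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Load the dataset and hold back 20% of the rows for final validation.
X, y = load_iris(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=1, stratify=y
)

# Summarize the dataset: dimensions, class balance, and per-attribute means.
print("Shape:", X.shape)
print("Rows per class:", np.bincount(y))
print("Attribute means:", X.mean(axis=0).round(2))

# Evaluate some algorithms with 10-fold cross-validation on the training rows.
models = {
    "LR": LogisticRegression(max_iter=1000),
    "KNN": KNeighborsClassifier(),
    "CART": DecisionTreeClassifier(random_state=1),
}
folds = StratifiedKFold(n_splits=10, shuffle=True, random_state=1)
results = {}
for name, model in models.items():
    scores = cross_val_score(model, X_train, y_train, cv=folds, scoring="accuracy")
    results[name] = scores.mean()
    print(f"{name}: {scores.mean():.3f} (+/- {scores.std():.3f})")

# Make some predictions with the best-scoring model on the held-back rows.
best_name = max(results, key=results.get)
best_model = models[best_name].fit(X_train, y_train)
predictions = best_model.predict(X_val)
validation_accuracy = accuracy_score(y_val, predictions)
print(f"Best model: {best_name}, validation accuracy: {validation_accuracy:.3f}")
```

The validation rows are never used during cross-validation, so the final accuracy gives an independent check on the model chosen.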
Take your time
Remember, take your time and work through each step of the project. The best way to go about things is by typing in the commands yourself as this will help you learn and remember the steps. However, if you want to speed things up, you can copy and paste the commands in the guide you are following. After the project, you will discover that completing a small end-to-end project from loading the data to making predictions is the best way to get familiar with a new platform such as Python.