Guest Post: This post is authored by Suram Saraswati Anugna and gives a general overview of the machine learning space. Please also check out her talk in the Machine Learning Made Simple webinar series.
Do you get automatic recommendations for which movies to watch next on Netflix and Amazon Prime? Or suggestions for people you might know on Facebook or LinkedIn? Do you use virtual personal assistants like Siri, Alexa, or Cortana? That’s all machine learning! This technology is growing in popularity, and it most likely plays a part in many of the products around you.
And it’s not a new concept. Arthur Samuel, a computer scientist at IBM, is credited with coining the term machine learning in 1959. Researchers have always been fascinated by the ability of machines to learn without being programmed in detail by humans; however, this has become much easier with the advent of big data, which makes it possible to create much more accurate machine learning algorithms that are actually workable in the tech industry. As a result, machine learning is a buzzword in the industry today, even though it has been around for a long time. But are you wondering what machine learning is anyway, what its subsets are, and what the different machine learning algorithms are? Read on to find the answers to all of your questions.
Let’s get started!!
First things first, let us look at the formal definition of Machine Learning. We can say that a Machine Learning algorithm learns from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E.
For example, let us take the case of a Machine Learning algorithm used to play chess. Here the experience E is playing many games of chess, the task T is playing chess against an opponent, and the performance measure P is the probability that the algorithm wins a game.
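To make E, T and P concrete, here is a minimal sketch in Python. It is not a chess player; it is a made-up toy problem chosen only because it is small enough to read. The task T is predicting y from x for data that follow a hypothetical rule y = 3x + 2, the experience E is the number of training passes over a handful of examples, and the performance measure P is the mean squared error on points the model has not seen (lower is better). All names and numbers are assumptions for illustration.

```python
# Task T (assumed for illustration): predict y from x, where y = 3x + 2.
train_data = [(x, 3 * x + 2) for x in [0.0, 1.0, 2.0, 3.0]]
held_out = [(x, 3 * x + 2) for x in [0.5, 1.5, 2.5]]


def mse(w, b, data):
    """Performance measure P: mean squared error of the line w*x + b."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def train_model(passes, lr=0.05):
    """Experience E: run `passes` gradient-descent passes over train_data."""
    w, b = 0.0, 0.0
    n = len(train_data)
    for _ in range(passes):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in train_data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in train_data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Measure P after increasing amounts of experience E.
experience_levels = (0, 10, 100, 1000)
errors = [mse(*train_model(e), held_out) for e in experience_levels]
for e, err in zip(experience_levels, errors):
    print(f"passes={e:4d}  held-out error={err:.6f}")
```

If the printed error shrinks as the number of passes grows, the program is "learning" in exactly the sense of the definition: its performance at T, as measured by P, improves with E.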
Artificial Intelligence (AI) and Machine Learning are closely related, with a few key differences. AI is defined as a program that exhibits cognitive ability similar to that of a human being. Making computers think like humans and solve problems the way we do is one of the main tenets of artificial intelligence. Machine Learning, on the other hand, is a subset of Artificial Intelligence that aims to create machines that can learn autonomously from data. Machine Learning is all about finding mathematical patterns in data to make future predictions.
Types of Machine Learning
As with any method, there are different approaches to training machine learning algorithms, each with its own advantages and disadvantages. To understand their pros and cons, however, we need to first look at what type of data they ingest. In Machine Learning, there are mainly two varieties of data: labeled data and unlabeled data.
Labeled data has both input and output parameters in a fully machine-readable form. Labelling the data, however, requires human effort to start with. Unlabeled data has only some of the parameters, or none of them, in a machine-readable form. This removes the need for human labor but calls for more complicated solutions.
There are also a few types of algorithms that are used only in very specific use-cases. In this article, I will discuss the 4 main approaches that are used today: Supervised, Unsupervised, Semi-Supervised and Reinforcement Learning.
With Supervised Learning, you train the machine on labelled data, where every training example comes with its correct output. The machine therefore gets to see the right answers while it learns. For a very (a little too) basic analogy, imagine a teacher supervising a class. The teacher already knows the correct answers, and the learning process doesn’t stop until the students learn the answers as well.
Supervised Machine Learning presently makes up the majority of the machine learning applications in use throughout the world. The algorithm learns how the input variable (x) maps to the output variable (y). Even though the data needs to be labeled accurately for this method to work, supervised learning is extremely powerful when used in the right circumstances.
Supervised learning is further categorized into Regression and Classification.
Regression: Problems categorized as regression are those in which the output variables are real numbers or continuous variables. Examples include height, weight and salary.
Classification: Problems categorized as classification are those in which the output variables are class labels or categorical variables. For example: whether a customer defaults on their loan (yes or no) or whether an email is spam (spam or not).
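The difference between the two is easiest to see side by side. The sketch below is a toy illustration with made-up numbers, not a production approach: the regression part fits a straight line to (years of experience, salary) pairs using ordinary least squares, and the classification part labels an email as spam or not by checking which class average its "suspicious word count" is closest to (a nearest-centroid rule, picked here only because it fits in a few lines). The data, feature and function names are all assumptions.

```python
from statistics import mean


def fit_line(xs, ys):
    """Regression: ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return slope, y_bar - slope * x_bar


def fit_centroids(xs, labels):
    """Classification: remember the average feature value of each class."""
    return {label: mean([x for x, l in zip(xs, labels) if l == label])
            for label in set(labels)}


def classify(x, centroids):
    """Predict the class whose average is closest to x."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


# Regression: the output is a continuous number (salary in thousands, made up).
years = [1.0, 2.0, 3.0, 4.0, 5.0]
salary = [35.0, 40.0, 45.0, 50.0, 55.0]
slope, intercept = fit_line(years, salary)
predicted_salary = slope * 6.0 + intercept

# Classification: the output is a class label (made-up spam data).
suspicious_words = [0.0, 1.0, 1.0, 6.0, 7.0, 8.0]
labels = ["not spam", "not spam", "not spam", "spam", "spam", "spam"]
centroids = fit_centroids(suspicious_words, labels)

print(predicted_salary)          # a number
print(classify(5.0, centroids))  # a label
```

In both cases the labelled outputs (salary, spam or not) are what the algorithm learns from; the only thing that changes is whether the answer is a number or a category.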
In Unsupervised Learning, there is no teacher in the class and the students are left to learn for themselves (a lot like real life!). So for Unsupervised Machine Learning algorithms, there is no specific answer to be learned and there are no predefined labels. The algorithm isn’t trained to map inputs to known outputs; instead, it explores the data for intrinsic patterns. It is left unsupervised to find the underlying structure in the data in order to learn more and more about the data itself.
For a relatable analogy, let us consider a visit to your friend’s house. There’s a football match on TV. You don’t know anything about football, but you end up watching it just because your friends are enjoying it. Here, you try to figure out the rules of the game while watching the two teams play: the different types of players, the goals and the penalties. Unsupervised learning works in a similar way, building up the patterns that best describe the data.
Unsupervised learning, particularly for anomaly detection, can be easier to work with since no labeled examples need to be fed into the algorithm. Unsupervised Learning is further categorized into Clustering and Association.
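As a small illustration of clustering, here is a minimal sketch of k-means on one-dimensional data. The algorithm choice, the data and the starting centres are assumptions made for the example; the point is only that no labels appear anywhere, and the groups come out of the data itself.

```python
def k_means_1d(points, centres, iterations=10):
    """Group 1-D points around `centres`, moving each centre to its group's mean."""
    centres = list(centres)
    clusters = [[] for _ in centres]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centre.
        clusters = [[] for _ in centres]
        for p in points:
            nearest = min(range(len(centres)), key=lambda i: abs(p - centres[i]))
            clusters[nearest].append(p)
        # Update step: each centre moves to the mean of its points
        # (a centre with no points stays where it is).
        centres = [sum(c) / len(c) if c else centres[i]
                   for i, c in enumerate(clusters)]
    return centres, clusters


# Unlabeled, made-up measurements that happen to fall into two groups.
points = [1.0, 1.2, 0.8, 9.0, 9.5, 10.5]

# Start the two centres at the smallest and largest values (an arbitrary choice).
centres, clusters = k_means_1d(points, centres=[min(points), max(points)])
print(centres)
print(clusters)
```

A point that ends up far from every centre would be a natural candidate for an anomaly, which is one reason clustering is often used as a starting point for anomaly detection.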
In Semi-Supervised Machine Learning, the students learn both from their teacher and by themselves. This is a combination of Supervised and Unsupervised Machine Learning: it uses a small amount of labeled data, as in Supervised Machine Learning, together with a larger amount of unlabeled data, as in Unsupervised Machine Learning, to train the algorithms.
First, the labeled data is used to partially train the Machine Learning Algorithm, and then this partially trained model is used to pseudo-label the rest of the unlabeled data. Finally, the Machine Learning Algorithm is fully trained using a combination of labeled and pseudo-labeled data.
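The three steps above can be sketched in a few lines of Python. This is a toy version under stated assumptions: the "model" is a simple nearest-centroid classifier on a single number, and the data and label names ("small", "large") are invented for the example. Real pseudo-labeling pipelines usually also keep only the predictions the model is confident about, which this sketch leaves out.

```python
from statistics import mean


def fit_centroids(examples):
    """Train: store the average feature value for each label."""
    by_label = {}
    for x, label in examples:
        by_label.setdefault(label, []).append(x)
    return {label: mean(xs) for label, xs in by_label.items()}


def predict(x, centroids):
    """Predict the label whose average is closest to x."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


# A small amount of labeled data and a larger amount of unlabeled data.
labeled = [(1.0, "small"), (9.0, "large")]
unlabeled = [0.5, 1.5, 2.0, 8.0, 8.5, 10.0]

# Step 1: partially train the model on the labeled data only.
partial_model = fit_centroids(labeled)

# Step 2: use the partially trained model to pseudo-label the unlabeled data.
pseudo_labeled = [(x, predict(x, partial_model)) for x in unlabeled]

# Step 3: fully train on the labeled and pseudo-labeled data combined.
final_model = fit_centroids(labeled + pseudo_labeled)

print(pseudo_labeled)
print(final_model)
```

Notice that the final centroids shift compared with the partial model: the unlabeled points, once pseudo-labeled, give the model a fuller picture of each class than the two hand-labeled examples could.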
Well, here are the hypothetical students who learn from their own mistakes over time (that’s like life!). Reinforcement Machine Learning algorithms learn optimal actions through trial and error. This approach differs from the ones above because it works on the concept of an interpreter and a reward system. The algorithm is rewarded when outcomes are favorable and punished when they are unfavorable. After an unfavorable outcome, the interpreter repeats the process until the best results are obtained.
In typical reinforcement learning use-cases, such as finding the shortest route between two points on a map, the solution is not an absolute value. Instead, it takes on a score of effectiveness, expressed as a percentage. The higher this percentage is, the more reward is given to the algorithm. Thus, the program is trained to give the best possible solution for the best possible reward.
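Here is a minimal sketch of that reward loop, using tabular Q-learning as one common reinforcement learning algorithm (the article does not name a specific one, so this choice is an assumption). The "map" is a made-up corridor of five positions with the goal at the far end. The agent moves left or right at random, earns a reward of 1 only when it reaches the goal, and future rewards are discounted so that shorter routes score higher. All constants are illustrative.

```python
import random

N_STATES = 5                 # positions 0..4 on a hypothetical corridor; 4 is the goal
ACTIONS = ["left", "right"]


def step(state, action):
    """Environment: move one position; reward 1.0 only on reaching the goal."""
    next_state = max(0, state - 1) if action == "left" else state + 1
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def train(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Learn a table of action values by trial and error (Q-learning)."""
    rng = random.Random(seed)
    q = [{a: 0.0 for a in ACTIONS} for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):                    # cap the length of one episode
            action = rng.choice(ACTIONS)        # explore by acting at random
            next_state, reward = step(state, action)
            best_next = max(q[next_state].values())
            # Nudge the estimate toward reward + discounted future value.
            q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
            state = next_state
            if state == N_STATES - 1:           # goal reached, episode over
                break
    return q


q_table = train()

# The learned policy: in each non-goal position, pick the higher-valued action.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

Because every detour delays the reward and the discount factor shrinks delayed rewards, the action values end up favoring the shortest route to the goal, which mirrors the "better solution, bigger reward" idea described above.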