At a very high level, machine learning is the process of teaching a computer system how to make accurate predictions when fed data.
Those predictions could be answering whether a piece of fruit in a photo is a banana or an apple, spotting people crossing the road in front of a self-driving car, or deciding whether an email is spam.
The key difference from traditional computer software is that a human developer hasn’t written code that instructs the system how to tell the difference between the banana and the apple.
Instead, a machine-learning model has been taught how to reliably discriminate between the fruits by being trained on a large amount of data, in this instance likely a huge number of images labelled as containing a banana or an apple.
Data, and lots of it, is the key to making machine learning possible.
Three main components of machine learning:
- Data: There are two main ways to get data: manual and automatic.
  - Manually collected data contains far fewer errors but takes more time to collect, which generally makes it more expensive.
  - The automatic approach is cheaper: you gather everything you can find and hope for the best.
- Features: Also known as variables or attributes (and sometimes loosely called parameters). These are the factors the machine looks at.
- Algorithms: The most obvious part. Any problem can be solved in different ways, and the method you choose affects the accuracy, performance, and size of the final model. There is one important nuance though: if the data is crappy, even the best algorithm won’t help. This is sometimes referred to as “garbage in, garbage out”. So don’t pay too much attention to the accuracy percentage; try to acquire more data first.
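The three components above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the fruit measurements are made up, and the names `extract_features`, `fit_threshold`, and `classify` are hypothetical helpers, not part of any library.

```python
# Data: hypothetical, hand-labelled fruit measurements (invented for illustration).
data = [
    {"length_cm": 18.0, "width_cm": 3.5, "label": "banana"},
    {"length_cm": 20.0, "width_cm": 4.0, "label": "banana"},
    {"length_cm": 7.5, "width_cm": 7.0, "label": "apple"},
    {"length_cm": 8.0, "width_cm": 7.5, "label": "apple"},
]


def extract_features(record):
    """Features: the factor the machine looks at, here a length/width ratio."""
    return record["length_cm"] / record["width_cm"]


def fit_threshold(records):
    """Algorithm: place a threshold midway between the two class averages."""
    by_label = {}
    for record in records:
        by_label.setdefault(record["label"], []).append(extract_features(record))
    means = {label: sum(vals) / len(vals) for label, vals in by_label.items()}
    return (means["banana"] + means["apple"]) / 2


def classify(record, threshold):
    """Elongated fruit (ratio above the threshold) is called a banana."""
    return "banana" if extract_features(record) > threshold else "apple"


threshold = fit_threshold(data)
print(classify({"length_cm": 19.0, "width_cm": 3.8}, threshold))
```

Swapping in a different feature or a different rule changes the model, but no rule can recover from measurements that are mislabelled in the first place.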
How it works:
Machine Learning works in a similar way to human learning. For example, if a child is shown images of specific objects, they can learn to identify and differentiate between them. Machine Learning works the same way: through data input and certain commands, the computer is enabled to “learn” to identify certain things (persons, objects, etc.) and to distinguish between them. For this purpose, the software is supplied with data and trained.
For instance, the programmer can tell the system that a particular object is a human being (“human”) and that another object is not (“no human”). The software receives continuous feedback from the programmer, and the algorithm uses these feedback signals to adapt and optimize the model. With each new data set fed into the system, the model is further optimized so that, in the end, it can clearly distinguish between “humans” and “non-humans”.
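One simple way to picture this feedback loop is a perceptron, sketched below. It is an assumption chosen for illustration, not necessarily the method any particular system uses. The two features (`walks_upright`, `has_wheels`) and the four examples are invented, and `predict_human` and `train` are hypothetical names.

```python
# Hypothetical examples: (walks_upright, has_wheels) -> 1 for "human", 0 for "no human".
examples = [
    ((1.0, 0.0), 1),  # pedestrian
    ((0.0, 1.0), 0),  # car
    ((0.0, 0.0), 0),  # dog
    ((1.0, 1.0), 1),  # person on a bicycle
]


def predict_human(weights, bias, features):
    """Return 1 ("human") if the weighted sum of features is positive, else 0."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0


def train(examples, epochs=10, learning_rate=1.0):
    """Adjust the weights whenever the prediction disagrees with the label."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for features, label in examples:
            # Feedback signal: +1, 0, or -1 depending on how the guess was wrong.
            error = label - predict_human(weights, bias, features)
            if error != 0:
                mistakes += 1
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * error
        if mistakes == 0:  # every example is classified correctly
            break
    return weights, bias


weights, bias = train(examples)
print(all(predict_human(weights, bias, f) == label for f, label in examples))
```

Each pass over the data nudges the model only where the feedback says it was wrong, which is the "adapt and optimize" step described above in miniature.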
Some machine learning methods:
Machine learning algorithms are often categorized as supervised or unsupervised, with semi-supervised and reinforcement learning as further categories.
Supervised machine learning algorithms apply what has been learned in the past from labeled examples to new data in order to predict future events. Starting from the analysis of a known training dataset, the learning algorithm produces an inferred function to make predictions about the output values. After sufficient training, the system is able to provide targets for any new input. The learning algorithm can also compare its output with the correct, intended output and find errors in order to modify the model accordingly.
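As a minimal sketch of an "inferred function", the snippet below fits a straight line to a tiny labeled dataset with ordinary least squares and then measures the error against the known outputs. The data points are invented, and `fit_line` and `mean_squared_error` are hypothetical helper names.

```python
# Hypothetical labeled training data: inputs xs with known outputs ys.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_squared_error(xs, ys, slope, intercept):
    """Compare the model's outputs with the correct, intended outputs."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


slope, intercept = fit_line(xs, ys)
prediction = slope * 5.0 + intercept  # target for a new, unseen input
print(slope, intercept, prediction)
print(mean_squared_error(xs, ys, slope, intercept))
```

Because the made-up points lie exactly on a line, the error here is zero; with real data the error would be positive and would guide how the model is adjusted.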
In contrast, unsupervised machine learning algorithms are used when the training data is neither classified nor labeled. Unsupervised learning studies how systems can infer a function that describes hidden structure in unlabeled data. The system isn’t told the right output; instead, it explores the data and draws inferences about that hidden structure.
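Clustering is one common example of finding hidden structure. The sketch below runs a bare-bones k-means on one-dimensional, unlabeled numbers. The data, the choice of two clusters, and the starting centroids (smallest and largest value) are all assumptions made for illustration; `assign` and `k_means_1d` are hypothetical names.

```python
# Hypothetical unlabeled measurements; no "right answer" is supplied.
points = [1.0, 1.2, 0.8, 9.0, 9.5, 10.0]


def assign(points, centroids):
    """Group each point with its nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
        clusters[nearest].append(p)
    return clusters


def k_means_1d(points, centroids, iterations=10):
    """Alternate between assigning points and moving centroids to cluster means."""
    centroids = list(centroids)
    for _ in range(iterations):
        clusters = assign(points, centroids)
        updated = [
            sum(c) / len(c) if c else centroids[i]  # keep centroid if cluster is empty
            for i, c in enumerate(clusters)
        ]
        if updated == centroids:  # nothing moved, so the grouping is stable
            break
        centroids = updated
    return centroids


centroids = k_means_1d(points, [min(points), max(points)])
print(centroids)
print(assign(points, centroids))
```

The algorithm is never told which group a number belongs to; the two groups emerge from the distances between the points alone.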
Semi-supervised machine learning algorithms fall somewhere in between supervised and unsupervised learning, since they use both labeled and unlabeled data for training, typically a small amount of labeled data and a large amount of unlabeled data. Systems that use this method can often considerably improve learning accuracy. Usually, semi-supervised learning is chosen when labeling the acquired data requires skilled and relevant resources, whereas acquiring unlabeled data generally doesn’t require additional resources.
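One way to combine the two kinds of data is self-training (pseudo-labeling), sketched below with a nearest-centroid rule. This is just one semi-supervised technique among several, picked here as an assumption for illustration. The numbers are invented, and `centroids_from`, `nearest_label`, and `self_train` are hypothetical names.

```python
# Hypothetical data: two expensive labeled examples and many cheap unlabeled ones.
labeled = [(0.0, "small"), (6.0, "large")]
unlabeled = [0.5, 1.0, 1.5, 7.0, 8.0, 9.0, 10.0]


def centroids_from(examples):
    """Average the values of each class to get one centroid per label."""
    groups = {}
    for value, label in examples:
        groups.setdefault(label, []).append(value)
    return {label: sum(vals) / len(vals) for label, vals in groups.items()}


def nearest_label(value, centroids):
    """Predict the label whose centroid is closest to the value."""
    return min(centroids, key=lambda label: abs(value - centroids[label]))


def self_train(labeled, unlabeled):
    """Pseudo-label the unlabeled data, then refit on labeled + pseudo-labeled."""
    centroids = centroids_from(labeled)
    pseudo = [(value, nearest_label(value, centroids)) for value in unlabeled]
    return centroids_from(labeled + pseudo)


before = centroids_from(labeled)
after = self_train(labeled, unlabeled)
print(before, nearest_label(4.0, before))
print(after, nearest_label(4.0, after))
```

In this made-up example the unlabeled points shift both centroids, so the prediction for the value 4.0 changes once they are taken into account. Whether such a shift improves accuracy depends on the data.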
Reinforcement machine learning is a learning method in which an agent interacts with its environment by producing actions and discovering errors or rewards. Trial-and-error search and delayed reward are the most relevant characteristics of reinforcement learning. This method allows machines and software agents to automatically determine the ideal behavior within a specific context in order to maximize their performance. Simple reward feedback is required for the agent to learn which action is best; this is known as the reinforcement signal.
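Trial-and-error learning from a reward signal can be sketched with an epsilon-greedy agent choosing among three actions. This is a deliberately simplified setting (a multi-armed bandit with immediate rather than delayed rewards). The reward levels, the noise range, and the name `run_bandit` are assumptions made for illustration.

```python
import random


def run_bandit(true_means, steps=500, epsilon=0.1, seed=0):
    """Learn the value of each action from noisy rewards by trial and error."""
    rng = random.Random(seed)
    n_actions = len(true_means)
    estimates = [0.0] * n_actions  # the agent's current guess of each action's value
    counts = [0] * n_actions
    for step in range(steps):
        if step < n_actions:
            action = step  # try every action once to begin with
        elif rng.random() < epsilon:
            action = rng.randrange(n_actions)  # explore: pick a random action
        else:
            action = max(range(n_actions), key=lambda a: estimates[a])  # exploit
        # Reinforcement signal: hypothetical reward with bounded noise.
        reward = true_means[action] + rng.uniform(-0.1, 0.1)
        counts[action] += 1
        # Incremental running average of the rewards seen for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts


estimates, counts = run_bandit([0.2, 0.8, 0.5])
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
print(best_action, counts)
```

The agent is never told which action is best; it only sees rewards, and its estimates settle around the underlying reward levels as it keeps acting.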
Advantages of Machine Learning:
Machine Learning helps people to work more creatively and efficiently. Basically, you can delegate quite complex or monotonous work to the computer through Machine Learning, from scanning, saving and filing paper documents such as invoices to organizing and editing images.
In addition to these rather simple tasks, self-learning machines can also perform complex tasks. These include, for example, the recognition of error patterns. This is a major advantage, especially in areas such as the manufacturing industry, which relies on continuous and error-free production. While even experts often cannot be sure where and why a production error arises across a fleet of plants, Machine Learning offers the possibility of identifying the error early, which reduces downtime and saves money.
Self-learning programs are now also used in the medical field. In the future, after “consuming” huge amounts of data (medical publications, studies, etc.), apps will be able to warn a patient if their doctor wants to prescribe a drug that the patient cannot tolerate. This “knowledge” also means that the app can propose alternatives which, for example, also take into account the genetic profile of the respective patient.