Call it a lucky find on Twitter, because it definitely was one. Santiago tweeted 20 questions you need to ace before getting a machine learning job. I figured I’d use these questions to understand developers’ work better and maybe get a glimpse into future applications.
The first questions were about various basic concepts of machine learning. Let’s imagine, for example, that we are given a puzzle as a gift. How do you put it together? Do you need the finished puzzle poster as a basis, or do you put the edge together first? Do you try to sort the colors? Depending on what kind of information you have, it makes sense to choose a different method and, therefore, a different algorithm.
The same applies to machine learning approaches such as supervised, unsupervised, semi-supervised, and reinforcement learning. A scientist decides on a training model for their algorithms depending on the available data and the research question. Santiago named this part of his tweet the “warming up” phase. A key takeaway for non-techies:
Depending on the research question, developers use a specific algorithm. The decision is based on data first, then the training model, and lastly, the algorithm.
The “Getting Deeper” Phase
In this phase, Santiago asks questions concerning supervised learning methods. How do we learn when supervised? The teacher labels things, and we accept these labels as true (e.g., knowing all presidents by heart). In machine learning, this means that the algorithm learns from a labeled dataset, and the labels act as the answer key against which its predictions are checked. The question of when to use classification rather than regression brings up the two main areas where supervised learning is useful. According to Isha Salian, “classification problems ask the algorithm to predict a discrete value, identifying the input data as a member of a particular class or group, regression problems look at continuous data.” Therefore, we need to remember as non-techies:
Supervised learning models need to have a clean, well-labeled set of available reference points or a ground truth to train the algorithm.
If we want to distinguish dogs from cats in pictures and have a well-labeled data set, classification is the better fit, because the answer is a category. If the answer were a number on a continuous scale, regression would be the better fit. Whatever the scientist chooses, they do so based on the research question and the possibilities that the data offers.
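To make the difference concrete, here is a minimal Python sketch. Everything in it is invented for illustration and does not come from Santiago's thread: the pet measurements, the puppy weights, and the helper names `nearest_neighbor_label` and `fit_line`. The first helper answers a classification question (cat or dog?), the second a regression question (what number should we expect?).

```python
# Toy illustration only: all data and helper names are made up.

def nearest_neighbor_label(labeled_samples, query):
    """Classification: return the label of the closest labeled example."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    closest = min(labeled_samples, key=lambda s: squared_distance(s[0], query))
    return closest[1]

def fit_line(xs, ys):
    """Regression: least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

# Labeled ground truth: (weight in kg, shoulder height in cm) -> species.
pets = [((4.0, 25.0), "cat"), ((5.0, 23.0), "cat"),
        ((20.0, 55.0), "dog"), ((30.0, 60.0), "dog")]
predicted_species = nearest_neighbor_label(pets, (6.0, 28.0))

# Continuous target: hypothetical puppy age in months -> weight in kg.
ages = [1, 2, 3, 4]
weights = [2, 4, 6, 8]
slope, intercept = fit_line(ages, weights)
predicted_weight = slope * 5 + intercept

print("Classification answer (a category):", predicted_species)
print("Regression answer (a number):", predicted_weight)
```

Both helpers learn from labeled examples, which is what makes them supervised. The only difference is the kind of answer they return.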
The “This Is About To Get Real” Phase
What is the curse of dimensionality? It turns out that this phenomenon shows up in different areas whenever data sets are analyzed and organized in high-dimensional spaces. The more features your data has, the more difficult it gets to “get to the point,” as Tony Yiu nicely puts it in his article about the curse. Because of this curse, scientists often turn to dimensionality reduction algorithms such as principal component analysis (PCA). You’ve definitely had confusing and never-ending conversations with a friend or family member. You know, the ones that start with a trip to the grocery store and end with a funeral? From what I gather, it’s precisely that. The curse prevents you from having a functioning machine learning model that can come up with patterns and tangible results. The key takeaway:
Too many features equal too many distractions: if the features outweigh the observations, scientists and developers run the risk of massively overfitting their model.
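A small experiment can show why too many features become distractions. The sketch below is my own illustration, not taken from Tony Yiu's article: it fills a tiny data set with pure random noise and checks how strongly the "best" feature appears to match the target. The names `correlation` and `strongest_fake_pattern`, the seed, and the sizes (10 observations, 2000 features) are arbitrary assumptions.

```python
import math
import random

def correlation(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

rng = random.Random(42)  # fixed seed so the run is repeatable
n_observations = 10
n_features = 2000

# The target and every feature are pure noise: there is nothing real to learn.
target = [rng.gauss(0, 1) for _ in range(n_observations)]
features = [[rng.gauss(0, 1) for _ in range(n_observations)]
            for _ in range(n_features)]

def strongest_fake_pattern(feature_count):
    """Largest absolute correlation with the target among the first features."""
    return max(abs(correlation(f, target)) for f in features[:feature_count])

few = strongest_fake_pattern(2)
many = strongest_fake_pattern(n_features)
print(f"Best match among 2 noise features:    {few:.2f}")
print(f"Best match among 2000 noise features: {many:.2f}")
```

With only ten observations and thousands of features, at least one noise feature will look like a strong signal purely by chance. A model that trusts it has overfitted, which is why reducing the number of features is often suggested.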
The “Let’s Now Get Deep Into It” Phase
Scientists and developers would like to achieve both low bias and low variance to get the best accuracy. Bias is the error that comes from a model being too simple to capture the pattern; variance is the error that comes from a model reacting too strongly to the particular data it was trained on. However, Jason Brownlee from Machine Learning Mastery Pty. Ltd. concludes that “we cannot calculate the real bias and variance error terms because we do not know the actual underlying target function. Nevertheless, as a framework, bias and variance provide the tools to understand machine learning algorithms’ behavior in the pursuit of predictive performance.”
The goal is to strike the best possible balance between low bias and low variance. Unfortunately, there aren’t a lot of methods to avoid this trade-off.
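The trade-off can be observed in a simulation, as long as we cheat a little. In the sketch below the target function is known because I made it up (a sine wave plus noise), which is exactly what Brownlee says we lack with real data. The model is a simple k-nearest-neighbors average. The function names, the noise level, and the choices k=1 and k=30 are assumptions for illustration only.

```python
import math
import random

def true_f(x):
    """Synthetic target function, known only because we invented it."""
    return math.sin(2 * math.pi * x)

def knn_predict(train, x, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return sum(y for _, y in nearest) / k

def bias_variance(k, trials=200, n=30, noise=0.3, seed=0):
    """Estimate squared bias and variance by retraining on fresh samples."""
    rng = random.Random(seed)
    test_xs = [i / 20 + 0.025 for i in range(20)]
    predictions = [[] for _ in test_xs]
    for _ in range(trials):
        train = []
        for _ in range(n):
            x = rng.random()
            train.append((x, true_f(x) + rng.gauss(0, noise)))
        for i, x in enumerate(test_xs):
            predictions[i].append(knn_predict(train, x, k))
    bias_sq = 0.0
    variance = 0.0
    for x, preds in zip(test_xs, predictions):
        mean_pred = sum(preds) / len(preds)
        bias_sq += (mean_pred - true_f(x)) ** 2
        variance += sum((p - mean_pred) ** 2 for p in preds) / len(preds)
    return bias_sq / len(test_xs), variance / len(test_xs)

flexible = bias_variance(k=1)   # follows every training point closely
rigid = bias_variance(k=30)     # averages all 30 points, ignores x
print("k=1  (bias^2, variance):", flexible)
print("k=30 (bias^2, variance):", rigid)
```

In this setup the flexible model (k=1) shows low bias but high variance, while the rigid model (k=30) shows the opposite. Values of k in between are where the balance is sought.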
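In his tweet, Santiago asks how data scientists can measure a machine learning model’s accuracy on a dataset using the F1-score. The F1-score combines two measures into a single number: precision (how many of the items the model flagged were actually correct) and recall (how many of the items it should have flagged were actually found). It is defined as the harmonic mean of the two.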
Scientists use the F-score in natural language processing applications, for instance, when evaluating named entity recognition and word segmentation.
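For readers who want to see the arithmetic, here is a short sketch that computes precision, recall, and the F1-score by hand. The labels are made up: imagine 1 means "this word is part of a name" and 0 means "it is not". The function name `precision_recall_f1` is my own and not from any particular library.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and their harmonic mean (the F1-score)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # hits
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false alarms
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # misses
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical ground truth and model output for ten words.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Here the model flags three words and gets two right (precision of 2/3), but it finds only two of the four real names (recall of 1/2). The harmonic mean works out to 4/7, roughly 0.571, which sits closer to the weaker of the two numbers than a plain average would.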
The Last Question: Model Accuracy or Model Performance?
Model accuracy shows how close the predicted values are to the target values. Model performance can mean speed. However, scientifically speaking, performance can also describe how well the model has executed the task according to the user requirements. In other words: it depends on the application.
If we say model performance is tied to speed, then some applications require real-time performance. Take, for instance, a conveyor belt carrying tomatoes, where the task is to separate the green ones from the red ones. Though an occasional error is undesirable, this machine’s success is determined more by its ability to keep up with the throughput. However, if we turn to medical applications, accuracy is more important than speed. If an application is meant to support doctors in diagnosing patients, I’d prefer an accurate application over a speedy one.
What amazed me while answering the questions was how many paradoxes and balancing acts come with the territory. So I think that these twenty questions are of great value for prospective developers, as they address different methodologies and raise more advanced questions. While I tried to make sense of as many questions as possible, needless to say, I would definitely flunk the interview. What about you?
Let me know what you think on Twitter, and as always - stay curious!