The Bias and Variance Tradeoff
Developing machine learning models isn’t so hard nowadays, thanks to the many frameworks and libraries that provide built-in functions for easily implementing some fancy model. However, it’s important to understand the theory behind the practical implementation in order to build more accurate models. So this topic discusses the theory behind prediction errors, also known as the bias and variance tradeoff.
In simple words, bias is the difference between the model’s mean prediction and the correct value, where the mean is taken over models trained on different training sets. Variance is the dispersion of the predictions for a given example, that is, how much they change from one training set to another. It’s also important to recall the meaning of overfitting and underfitting. When bias is high and variance is low, the model cannot adapt to the training data (high bias), although its predictions stay stable from one training set to another (low variance), so it makes similar errors on training data and on unseen data. This is called underfitting. When the opposite happens, the model fits the training data almost perfectly (low bias) but cannot generalize to new data (high variance). This is a case of overfitting.
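The two definitions above can also be written as formulas. The following is a sketch under assumptions that are not in the original text: the data are taken to follow y = f(x) + ε with zero-mean noise of variance σ², and f̂_D denotes the model obtained by training on a training set D. Under those assumptions, the expected squared error at a point x splits into a bias term, a variance term, and noise that no model can remove:

```latex
\begin{aligned}
\operatorname{Bias}\big[\hat f(x)\big] &= \mathbb{E}_D\big[\hat f_D(x)\big] - f(x) \\
\operatorname{Var}\big[\hat f(x)\big] &= \mathbb{E}_D\Big[\big(\hat f_D(x) - \mathbb{E}_D[\hat f_D(x)]\big)^2\Big] \\
\mathbb{E}_{D,\varepsilon}\Big[\big(y - \hat f_D(x)\big)^2\Big] &= \operatorname{Bias}\big[\hat f(x)\big]^2 + \operatorname{Var}\big[\hat f(x)\big] + \sigma^2
\end{aligned}
```

Read this way, underfitting corresponds to a large first term and overfitting to a large second term.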
After recalling the key terms for this topic, it’s time to discuss the bias and variance tradeoff. Normally, a model is either very simple (few parameters), with high bias and low variance, or very complex (a large number of parameters), with low bias and high variance. It seems paradoxical, because the model would have to be simple and complex at the same time, but the real mastery lies in finding the balance between the two, so that the total prediction error is as low as possible. In short, training a machine learning model is like training two annoying acrobats on a fragile trunk: finding harmony and stability is not easy if we do not know the character of each of the acrobats.
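The tradeoff described above can be observed in a small simulation. The sketch below is only an illustration under stated assumptions, none of which come from the original text: the true function is a sine wave, the models are polynomials fitted with NumPy, and the names `true_function` and `estimate_bias_variance` are hypothetical helpers. Each model is refitted on many noisy training sets, and the squared bias and the variance of its predictions are then estimated.

```python
import numpy as np


def true_function(x):
    # Assumed ground truth for the simulation (not from the article).
    return np.sin(2 * np.pi * x)


def estimate_bias_variance(degree, n_datasets=200, n_points=30,
                           noise_std=0.3, seed=0):
    """Estimate squared bias and variance of a polynomial model.

    The same inputs are reused for every training set; only the noise
    in the targets is resampled (a simplifying assumption).
    """
    rng = np.random.default_rng(seed)
    x_train = np.linspace(0.0, 1.0, n_points)
    x_test = np.linspace(0.05, 0.95, 50)

    predictions = np.empty((n_datasets, x_test.size))
    for i in range(n_datasets):
        # Draw a fresh noisy training set and fit the polynomial to it.
        y_train = true_function(x_train) + rng.normal(0.0, noise_std, n_points)
        coefficients = np.polyfit(x_train, y_train, degree)
        predictions[i] = np.polyval(coefficients, x_test)

    # Bias: mean prediction over all training sets versus the true value.
    mean_prediction = predictions.mean(axis=0)
    bias_squared = np.mean((mean_prediction - true_function(x_test)) ** 2)
    # Variance: spread of the predictions around their own mean.
    variance = np.mean(predictions.var(axis=0))
    return bias_squared, variance


# Simple (degree 1), intermediate (degree 3) and complex (degree 9) models.
results = {degree: estimate_bias_variance(degree) for degree in (1, 3, 9)}
for degree, (bias_squared, variance) in results.items():
    print(f"degree={degree}: bias^2={bias_squared:.4f}, variance={variance:.4f}")
```

In this setup the straight line should show the largest squared bias and the degree 9 polynomial the largest variance, which is the pattern the tradeoff predicts. The exact numbers depend on the assumed noise level and on the number of training points.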
By: Vladimir Balayan