Blog

🏆 Bias – Variance Tradeoff

‘As a Data Scientist, should I be a specialist or a generalist? After all, data science is an ocean!’ As a first-semester student pursuing my Master’s in Analytics, this was the question on my mind after the professors introduced a plethora of new terminology to me in every…
Read more

PCA explained with a narrative

Continued criticism from the liberals of Essos compelled Robert Baratheon of King’s Landing to summon the education ministry to discuss an important aspect of the matriculation exam in Westeros. The objective of the meeting was to reduce the number of subjects in the matriculation exam. The education system tests students on their ability in the subjects “English Literature”, “English…
Read more

Support Vector Machines – Not for the faint-hearted

Definition: A Support Vector Machine (SVM) is a powerful algorithm used for regression and classification problems. Like an Artificial Neural Network, an SVM operates as a black box; that is, it is difficult to explain how the input gets translated into the output. The term “Support Vectors” in SVM refers to the data points…
Read more

Gradient Boosting Machines

Illustration taken from http://uc-r.github.io/gbm_regression

Boosting is a method of converting weak learners into strong learners. In boosting, each new tree is a…
Read more

The Superman Algorithm: Logistic Regression

What is Logistic Regression? Logistic Regression is one of the most popular Machine Learning algorithms for binary classification. It is a simple but powerful algorithm that is easy to implement, works well as a baseline, and performs well enough on many tasks. An example of a Logistic Regression problem is an algorithm used for cancer detection…
Read more

Introduction to Logistic Regression

Logistic regression is used for binary (two-class) classification as well as multi-class classification. A straight line may not be able to classify the data in all cases…
Read more

Bias and Variance – the struggle of daily life

Bias refers to the error that is introduced by approximating a real-life problem, which may be extremely complicated, by a much simpler model. Variance refers to the amount by which the model’s prediction would change if we estimated it using a different training data set. The challenge lies in finding a method for which…
Read more

Bias-Variance Dilemma

When I started my journey in Data Science, I always found it difficult to remember the difference between Bias and Variance. We always talk about the Bias-Variance tradeoff when we talk about model predictions. This post presents a very basic understanding of these terms and two related terms – Underfitting and…
Read more

Understanding the Bias-Variance Trade-off

George Box once said, “All models are wrong, but some are useful.” From a supervised machine learning perspective, all models have errors, and to make our models useful, we have to minimize such errors. More specifically, we have to minimize two major sources of error: bias and variance. Prior to applying a machine learning algorithm,…
Read more