Classification vs. Regression
Introduction
Here we describe a common way to organize prediction problems: as regression problems or as classification problems.
Classification
Classification problems typically have qualitative responses, such as disease status, eye color, or the type of car a person owns. Often, we want to "classify" things into one group or another.
Importance Of The Output Type, Not The Input Type
You will notice that regression and classification problems differ by their responses, not so much their inputs. That is, what matters here is the output: whether we want a quantitative response (regression) or a qualitative one (classification).
Challenges With Distinguishing Between The Two
The line between the two can get blurry, because converting between them is often relatively easy. For instance, we can easily encode qualitative information like eye color as a number: 1 for blue, 2 for green, etc. Such a coding imposes an order and a spacing on the categories that may not really exist, and this tends to matter most with models that are fairly inflexible.
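As a rough illustration of the encoding point above, here is a minimal Python sketch. The category list (including "brown" as code 3) and the helper names `integer_encode` and `indicator_encode` are assumptions made for this example, not part of the original text. It shows how an integer coding lets arithmetic produce a "meaningful-looking" category, and contrasts it with a 0/1 indicator coding that implies no order.

```python
# Hypothetical category list; assigning 1, 2, 3 in this order is arbitrary.
EYE_COLORS = ["blue", "green", "brown"]

def integer_encode(color):
    """Map a color to 1, 2, 3, ... following its position in EYE_COLORS."""
    return EYE_COLORS.index(color) + 1

def indicator_encode(color):
    """Map a color to a 0/1 vector with a single 1 (no implied order)."""
    return [1 if color == c else 0 for c in EYE_COLORS]

codes = [integer_encode(c) for c in ["blue", "brown"]]
mean_code = sum(codes) / len(codes)

print(codes)                      # [1, 3]
print(mean_code)                  # 2.0, which happens to be the code for "green"
print(indicator_encode("green"))  # [0, 1, 0]
```

The average of "blue" and "brown" lands exactly on "green", which is an artifact of the coding rather than a fact about eye color. A model that treats the code as a true quantity inherits that artifact.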
Additionally, "regression" in this context doesn’t always refer to the type of modeling technique. For instance, logistic regression is most often used for a binary (qualitative) output, whereas least squares regression produces a quantitative output. Both are regression modeling techniques, but they fall on opposite sides of the classification vs. regression distinction. For this reason, the terms "quantitative" and "qualitative" are sometimes used instead of "regression" and "classification" to be clear.
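To make the contrast concrete, the following sketch places the two kinds of output side by side using only the Python standard library. The data are made up, the logistic coefficients are hypothetical values rather than fitted ones, and the function names `least_squares_fit` and `logistic_predict` are chosen only for this example.

```python
import math

def least_squares_fit(xs, ys):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def logistic_predict(x, intercept, slope, threshold=0.5):
    """Return (probability, label) for a logistic model with given coefficients."""
    p = 1.0 / (1.0 + math.exp(-(intercept + slope * x)))
    return p, ("yes" if p >= threshold else "no")

# Made-up data lying exactly on y = 2x + 1.
b0, b1 = least_squares_fit([0, 1, 2, 3], [1, 3, 5, 7])
quantitative = b0 + b1 * 4          # a number: 9.0

# Hypothetical (not fitted) coefficients, for illustration only.
prob, label = logistic_predict(4, intercept=-2.0, slope=1.0)

print(quantitative)  # 9.0
print(label)         # yes
```

The least squares prediction is itself the answer, a quantity. The logistic model produces a probability that is then thresholded into a class label, which is why it is usually counted as a classification method despite its name.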