Supervised Learning: Classification and Regression
Exploring algorithms that learn from labeled data to make predictions.
About This Topic
Supervised learning is the most widely used paradigm in practical machine learning, and it divides into two main task types: classification (predicting a category) and regression (predicting a numeric value). Spam detection is a classification problem: is this email spam or not? Predicting a house's sale price given its features is a regression problem. Understanding this distinction helps students map real-world problems to the right algorithmic approach.
Common supervised learning algorithms that 11th graders encounter include decision trees (which split data based on feature thresholds into a tree structure), linear regression (which fits a line to predict continuous outputs), and k-nearest neighbors (which classifies new points based on their closest training examples). Each has different strengths and failure modes, and students benefit from comparing them across the same dataset.
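To make one of these algorithms concrete, here is a minimal k-nearest neighbors classifier in plain Python. The data points and labels are invented for illustration; real lessons would use an actual dataset.

```python
import math

# Toy labeled dataset: (feature1, feature2) -> class label.
# These points are hypothetical, chosen so the two classes form clusters.
training_data = [
    ((1.0, 1.0), "cat"),
    ((1.5, 1.8), "cat"),
    ((5.0, 8.0), "dog"),
    ((6.0, 9.0), "dog"),
]

def knn_predict(point, data, k=3):
    """Classify `point` by majority vote among its k nearest training examples."""
    # Sort training examples by Euclidean distance to the query point.
    by_distance = sorted(data, key=lambda item: math.dist(point, item[0]))
    neighbors = [label for _, label in by_distance[:k]]
    # Majority vote among the k closest labels.
    return max(set(neighbors), key=neighbors.count)

print(knn_predict((1.2, 1.1), training_data))  # prints "cat"
```

The whole algorithm fits in a few lines because k-nearest neighbors has no training phase: it simply stores the labeled examples and measures distances at prediction time, which is exactly the geometric intuition students can visualize.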
Active learning is effective here because these algorithms have geometric intuitions (decision boundaries, hyperplanes, distance metrics) that become clearer through visualization and hands-on exploration than through equations alone. Building and testing simple models on real datasets gives students direct feedback on the tradeoffs between algorithms.
Key Questions
- Explain the difference between classification and regression tasks in supervised learning.
- Analyze how algorithms like Decision Trees or Linear Regression make predictions.
- Construct a simple supervised learning model using a given dataset.
Learning Objectives
- Compare the predictive accuracy of a Decision Tree model versus a Linear Regression model on a given dataset.
- Explain the fundamental difference between classification and regression tasks using concrete examples.
- Construct a simple supervised learning model (e.g., Decision Tree or Linear Regression) to predict outcomes from a provided dataset.
- Analyze the decision boundaries or regression line generated by a chosen algorithm to understand how it makes predictions.
Before You Start
- Organizing data: Students need to understand how to organize and access data in tables or similar structures to use it for training models.
- Algorithmic thinking: Familiarity with basic algorithmic thinking, including sequential steps and conditional logic, is helpful for understanding how algorithms like Decision Trees operate.
- Basic statistics: Understanding fundamental statistical concepts aids in comprehending how regression models fit data and how classification models use feature distributions.
Key Vocabulary
| Term | Definition |
| --- | --- |
| Supervised Learning | A type of machine learning where an algorithm learns from a labeled dataset, meaning each data point has a known correct output or category. |
| Classification | A supervised learning task focused on predicting a discrete category or class label, such as 'spam' or 'not spam' for emails. |
| Regression | A supervised learning task focused on predicting a continuous numerical value, such as the price of a house or temperature. |
| Decision Tree | An algorithm that makes predictions by creating a tree-like structure of decisions based on feature values, splitting data at each node. |
| Linear Regression | An algorithm that models the relationship between a dependent variable and one or more independent variables by fitting a linear equation to the observed data. |
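The "fitting a linear equation" in the last definition can be shown directly: for one input variable, the least-squares slope and intercept have a closed form. The numbers below are toy data invented for illustration (roughly following y = 2x).

```python
# Least-squares fit of y = m*x + b to toy data (values chosen for illustration).
xs = [1, 2, 3, 4, 5]
ys = [2.1, 4.2, 5.9, 8.1, 9.9]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Slope: covariance of x and y divided by variance of x.
m = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
    / sum((x - mean_x) ** 2 for x in xs)
# Intercept: the fitted line passes through the point of means.
b = mean_y - m * mean_x

print(round(m, 2), round(b, 2))  # slope ~ 1.95, intercept ~ 0.19
```

Students can verify by hand that the fitted line stays close to every point, which is what "minimizing squared error" means geometrically.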
Watch Out for These Misconceptions
Common Misconception: Supervised learning requires huge datasets to work.
What to Teach Instead
Simple algorithms like decision trees and k-nearest neighbors can work effectively on very small datasets. What matters is that the data is representative and the number of features is manageable. Deep learning often needs large datasets, but many supervised learning algorithms are designed for smaller, structured data. Students can build meaningful models with hundreds of examples.
Common Misconception: A more complex algorithm always produces better predictions.
What to Teach Instead
Complexity increases the risk of overfitting: memorizing training data rather than learning generalizable patterns. A simple, well-tuned decision tree often outperforms a complex model on a small dataset. Algorithm selection involves tradeoffs between complexity, interpretability, and performance on the specific problem at hand.
Common Misconception: High accuracy on training data means the model is good.
What to Teach Instead
A model that achieves perfect accuracy on training data may have simply memorized the examples; this is overfitting. The meaningful measure is accuracy on held-out test data the model has never seen. This is why splitting data into training and test sets before modeling is a fundamental practice, not an optional step.
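The split itself takes only a few lines of plain Python. The (feature, label) pairs below are hypothetical, made up purely to show the mechanics:

```python
import random

random.seed(0)  # fixed seed so the split is reproducible in class

# Hypothetical labeled examples: (feature, label) pairs.
data = [(x, x % 2) for x in range(20)]

random.shuffle(data)             # shuffle first to avoid ordering bias
split = int(0.8 * len(data))     # 80% for training, 20% held out for testing
train, test = data[:split], data[split:]

print(len(train), len(test))  # prints "16 4"
```

The key teaching point is order of operations: the test set is carved off before any modeling happens, so it can genuinely stand in for unseen data.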
Active Learning Ideas
Card Sort: Classification vs. Regression
Give pairs a set of problem cards (predict tomorrow's high temperature, diagnose a tumor as benign or malignant, estimate a car's resale value, classify a review as positive or negative). Partners sort them into classification or regression and write a one-sentence justification for each. Debrief addresses any cards that prompted disagreement.
Decision Tree Construction Activity
Provide groups with a small labeled dataset (e.g., 20 animals with features like size, diet, habitat) and ask them to build a decision tree by hand, choosing splits that best separate the classes. Groups compare their trees and discuss which features they chose and why. Connect to how algorithms like ID3 make these choices systematically.
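The "systematic choice" that ID3 makes can be shown with a short entropy calculation. This sketch uses a hypothetical 8-animal split rather than a real dataset:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(labels, left, right):
    """Entropy reduction from splitting `labels` into `left` and `right`."""
    n = len(labels)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - weighted

# Hypothetical example: 8 animals, and a feature that separates them perfectly.
labels = ["mammal"] * 4 + ["bird"] * 4
print(information_gain(labels, ["mammal"] * 4, ["bird"] * 4))  # prints 1.0
```

A perfect split earns the maximum gain of 1 bit; a split that leaves both sides mixed earns less, which is why the algorithm prefers features that cleanly separate the classes, just as students do intuitively when building trees by hand.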
Think-Pair-Share: When Does the Algorithm Fail?
Show students an example where a decision tree overfits a small dataset (perfect accuracy on training, poor on test). Ask partners to explain in their own words what went wrong and propose one fix. Share explanations with the class. This activity surfaces overfitting intuition before formally defining it.
Live Coding: Sklearn Supervised Model
Students follow along building a simple classification or regression model using scikit-learn on a provided dataset (e.g., iris flowers or housing prices). At three points, the instructor pauses and students predict what the next line of output will be before it runs. Pairs discuss predictions, then see the result. Debrief covers what the metrics mean.
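A minimal version of the live-coding demo might look like the following, assuming scikit-learn is installed. The choice of `max_depth=3` and the 25% test split are illustrative defaults, not requirements:

```python
# Classify iris flowers with a shallow decision tree, evaluating on held-out data.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)  # hold out 25% for testing

model = DecisionTreeClassifier(max_depth=3, random_state=42)
model.fit(X_train, y_train)

# Accuracy on data the model never saw during training.
print(accuracy_score(y_test, model.predict(X_test)))
```

Good pause points for the predict-then-run routine: after `train_test_split` (how many rows in each split?), after `fit` (what did the tree learn?), and before the final `print` (will test accuracy be perfect?).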
Real-World Connections
- Financial analysts use regression models to predict stock prices or the potential return on investment for new projects, informing business decisions at companies like Goldman Sachs.
- Medical researchers employ classification algorithms to diagnose diseases based on patient symptoms and test results, aiding doctors in identifying conditions like cancer or diabetes.
- E-commerce platforms like Amazon use classification models to filter customer reviews, identifying and flagging those that are inappropriate or spam.
Assessment Ideas
Present students with three scenarios: predicting if a customer will click an ad, estimating a car's fuel efficiency, and identifying a handwritten digit. Ask students to label each as either a classification or regression problem and briefly justify their choice.
Provide students with a small, pre-cleaned dataset (e.g., housing features and prices). Ask them to identify the target variable and state whether this is a classification or regression task. Then, have them write one sentence describing how a Decision Tree might approach this problem.
Facilitate a class discussion: 'Imagine you are building a model to predict whether a student will pass a course. What kind of data would you need? Would this be a classification or regression problem? What are the potential ethical considerations if your model is biased?'
Frequently Asked Questions
What is the difference between classification and regression in machine learning?
How does a decision tree make predictions?
How do active learning activities help students understand supervised learning algorithms?
What is a training set and a test set and why does the split matter?
More in Artificial Intelligence and Ethics
Introduction to Artificial Intelligence
Students will define AI, explore its history, and differentiate between strong and weak AI.
Machine Learning Fundamentals
Introduction to how computers learn from data through supervised and unsupervised learning.
Unsupervised Learning: Clustering
Discovering patterns and structures in unlabeled data using algorithms like K-Means.
AI Applications: Image and Speech Recognition
Exploring how AI is used in practical applications like recognizing images and understanding speech.
Training Data and Model Evaluation
Understanding the importance of data quality, feature engineering, and metrics for model performance.
Algorithmic Bias and Fairness
Investigating how human prejudices can be encoded into automated decision-making tools.