Scatter Plots and Correlation — How to Read and Draw
Grade: 8-9 | Topic: Statistics
What You Will Learn
A scatter plot reveals whether two variables are related. In this guide you will learn how to plot data on a scatter graph, identify the type and strength of correlation, draw a line of best fit by eye, and use it to make predictions — a core data analysis skill used in science, business, and research.
Theory
What is a scatter plot?
A scatter plot displays pairs of numerical data values as points on a coordinate grid. The independent variable (the one you control or think may influence the other) goes on the x-axis, and the dependent variable (the one you measure in response) goes on the y-axis.
Each data pair becomes one dot. When you have many dots, patterns in their spread reveal the relationship between the two variables.
Types of correlation
Positive correlation: As x increases, y also increases. The cloud of points slopes upward from left to right.
Example: Height and shoe size — taller people tend to have larger feet.
Negative correlation: As x increases, y decreases. The cloud slopes downward from left to right.
Example: Hours of TV watched and exam score — more TV often correlates with lower scores.
No correlation: The dots show no clear upward or downward pattern — they are scattered randomly.
Example: Shoe size and exam score — no meaningful relationship.
Strength of correlation
- Strong correlation: points cluster tightly around an imaginary line.
- Weak correlation: points are more spread out but still show a general direction.
- Perfect correlation (rare in real data): all points lie exactly on a straight line.
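Direction and strength can also be put into a single number. The sketch below goes a step beyond judging by eye: it computes Pearson's correlation coefficient r for the study-hours data used in the worked examples. The function names and the 0.7 and 0.3 cut-offs are illustrative assumptions, not standard definitions; different textbooks draw these boundaries differently.

```python
import math

hours = [1, 2, 3, 4, 5, 6]
scores = [45, 52, 60, 68, 74, 85]

def pearson_r(xs, ys):
    """Pearson's r: +1 is a perfect upward line, -1 a perfect downward line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def describe(r, strong=0.7, weak=0.3):
    """Turn r into words. The cut-offs are assumptions chosen for illustration."""
    if abs(r) < weak:
        return "no clear correlation"
    direction = "positive" if r > 0 else "negative"
    strength = "strong" if abs(r) >= strong else "weak"
    return f"{strength} {direction}"

r = pearson_r(hours, scores)
print(round(r, 3), describe(r))  # r is about 0.997, so "strong positive"
```

A value of r close to +1 or -1 matches the "points cluster tightly around a line" description, while a value near 0 matches "no clear pattern".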
The line of best fit
A line of best fit (also called a trend line) is a straight line drawn through the data so that roughly equal numbers of points are above and below it. It summarises the trend and allows predictions.
To draw it by eye:
- Identify the general direction of the data.
- Draw a line so roughly half the points are above and half below.
- The line should pass through (or near) the mean point (x̄, ȳ), where x̄ is the mean of the x-values and ȳ is the mean of the y-values.
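The by-eye steps above can be checked with a few lines of code. This sketch uses the study-hours data from the worked examples, finds the mean point, and counts how many points sit above and below a candidate line. The slope of 8 is a hypothetical by-eye guess, not a value given in this guide.

```python
hours = [1, 2, 3, 4, 5, 6]
scores = [45, 52, 60, 68, 74, 85]

# Mean point: the by-eye line should pass through or near it.
mean_x = sum(hours) / len(hours)
mean_y = sum(scores) / len(scores)

# Hypothetical by-eye guess: about 8 score points gained per extra hour.
guess_slope = 8
# Choosing the intercept this way forces the line through the mean point.
guess_intercept = mean_y - guess_slope * mean_x

def line_y(x):
    """Height of the candidate line at x."""
    return guess_slope * x + guess_intercept

pairs = list(zip(hours, scores))
above = [(x, y) for x, y in pairs if y > line_y(x)]
below = [(x, y) for x, y in pairs if y < line_y(x)]
on_line = [(x, y) for x, y in pairs if y == line_y(x)]

print("mean point:", (mean_x, mean_y))
print("above:", above, "below:", below, "on the line:", on_line)
```

If the counts above and below are very uneven, adjust the slope and try again, just as you would tilt a ruler on paper.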
Using the line of best fit to predict
Once you have the line, read off the y-value for any x-value (interpolation — within the data range) or extend the line beyond the data (extrapolation — outside the range, less reliable).
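To make the distinction concrete, here is a small sketch that labels a prediction as interpolation or extrapolation by comparing the x-value with the range of the data. The function name `prediction_type` is made up for this illustration.

```python
hours = [1, 2, 3, 4, 5, 6]  # x-values from the worked examples

def prediction_type(x, xs):
    """Interpolation if x lies inside the observed x-range, otherwise extrapolation."""
    if min(xs) <= x <= max(xs):
        return "interpolation"
    return "extrapolation"

print(prediction_type(4.5, hours))  # inside the range 1 to 6
print(prediction_type(10, hours))   # outside the range, treat with caution
```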
Worked Examples
Example 1 — Plotting a scatter graph
Hours studied: 1, 2, 3, 4, 5, 6
Test scores: 45, 52, 60, 68, 74, 85
Step 1: Place "hours studied" on the x-axis (0–7 scale) and "test score" on the y-axis (0–100 scale).
Step 2: Plot each pair: (1, 45), (2, 52), (3, 60), (4, 68), (5, 74), (6, 85).
Step 3: Observe that the points slope upward — positive correlation.
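If you want to see the same three steps without graph paper, the sketch below prints a rough text version of the scatter graph. It groups the scores into bands of 10 (an assumption made to keep the picture small), so it shows the upward pattern but not the exact heights. The function name `text_scatter` is hypothetical.

```python
hours = [1, 2, 3, 4, 5, 6]
scores = [45, 52, 60, 68, 74, 85]

def text_scatter(xs, ys, y_low=40, y_high=90, y_step=10):
    """Return a rough text scatter plot with one row per band of y-values."""
    columns = range(min(xs), max(xs) + 1)
    lines = []
    # Work from the top band down so larger y-values appear higher up.
    for band_start in range(y_high - y_step, y_low - 1, -y_step):
        row = ""
        for x in columns:
            # Mark the cell if some point has this x and a y inside the band.
            hit = any(px == x and band_start <= py < band_start + y_step
                      for px, py in zip(xs, ys))
            row += " * " if hit else " . "
        lines.append(f"{band_start:>3} |{row}")
    lines.append("    +" + "---" * len(columns))
    lines.append("     " + "".join(f"{x:^3}" for x in columns))
    return "\n".join(lines)

plot = text_scatter(hours, scores)
print(plot)
```

The stars climb from the bottom-left to the top-right of the grid, which is the upward slope described in Step 3.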
Example 2 — Identifying correlation type and strength
A scatter graph shows temperature (x) versus number of hot drinks sold (y). As temperature rises, fewer hot drinks are sold. Points lie close to a straight line.
Type: Negative correlation (as x rises, y falls).
Strength: Strong (points cluster tightly).
Example 3 — Drawing and using a line of best fit
For the hours-studied data above:
Step 1: Mean of x = (1 + 2 + 3 + 4 + 5 + 6) / 6 = 3.5, mean of y = (45 + 52 + 60 + 68 + 74 + 85) / 6 = 64.
Step 2: Draw the line through approximately (3.5, 64) with a positive slope, balancing points above and below.
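Step 3: Prediction: For 4.5 hours studied, read the line at x = 4.5. The line gives approximately y = 72, so the predicted score is about 72 (a line drawn by eye may give a value a point or two either side).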
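As a cross-check on a by-eye line, you can let a calculation choose the line instead. The sketch below swaps the by-eye method for a least-squares fit, a standard technique that this guide does not cover, and uses it to predict the score for 4.5 hours. The function name is an assumption for illustration.

```python
hours = [1, 2, 3, 4, 5, 6]
scores = [45, 52, 60, 68, 74, 85]

def least_squares_line(xs, ys):
    """Slope and intercept of the least-squares line (not the by-eye method)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    # This choice of intercept makes the line pass through the mean point.
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = least_squares_line(hours, scores)
predicted = slope * 4.5 + intercept

print("slope:", round(slope, 2))          # about 7.83 score points per hour
print("intercept:", round(intercept, 2))  # about 36.6
print("prediction at 4.5 hours:", round(predicted, 1))  # about 71.8
```

A carefully drawn by-eye line should give a reading close to this value. A large difference suggests the hand-drawn line is tilted too steeply or too gently.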
Example 4 — Outliers
An outlier is a point that falls far from the general trend. When drawing the line of best fit, do not force the line through outliers — they may represent errors or exceptional cases.
Common Mistakes
Mistake 1 — Connecting the dots
❌ Drawing lines between each consecutive data point (like a line graph).
✅ A scatter plot has isolated dots — no connecting lines. The only line drawn is the trend line (line of best fit).
Mistake 2 — Confusing correlation with causation
❌ "Ice cream sales and drowning rates are positively correlated, so ice cream causes drowning."
✅ Correlation shows a relationship, not a cause. Both ice cream sales and swimming increase in hot weather — the real cause is temperature (a confounding variable).
Mistake 3 — Extrapolating too far
❌ Using a line of best fit to predict values far outside the data range with full confidence.
✅ Predictions within the data range (interpolation) are generally more reliable. Extrapolation becomes less accurate the further you go beyond the data.
Practice Problems
Problem 1: A scatter plot shows age (x) versus running speed (y) for people over 40. As age increases, speed decreases. What type of correlation is this?
Answer:
Negative correlation.
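Problem 2: A line of best fit passes through (2, 30) and (8, 60). What score would you predict at a given x-value, for example x = 5?
Answer:
Slope of line: (60 − 30) / (8 − 2) = 30 / 6 = 5.
Equation: y − 30 = 5(x − 2), so y = 5x + 20.
At x = 5: y = 5(5) + 20 = 45.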
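The same working can be sketched in code: find the slope and intercept of the line through (2, 30) and (8, 60), then substitute any x-value. The helper names `line_through` and `predict` are made up for this illustration.

```python
def line_through(p1, p2):
    """Slope and intercept of the straight line through two points."""
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)   # rise over run
    intercept = y1 - slope * x1     # rearranged from y1 = slope * x1 + intercept
    return slope, intercept

slope, intercept = line_through((2, 30), (8, 60))

def predict(x):
    """Read the line at x."""
    return slope * x + intercept

print(slope, intercept)        # 5.0 20.0
print(predict(2), predict(8))  # 30.0 60.0, the two given points
```

Checking that the line returns the two given points is a quick way to confirm the equation before using it for a new x-value.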
Problem 3: Points on a scatter plot are spread all over with no pattern. What does this tell you?
Answer:
No correlation: the plot shows no linear relationship between the two variables, so knowing one does not help you predict the other.
Problem 4: A student draws a line of best fit through all the highest points, ignoring the rest. What is wrong with this?
Answer:
The line of best fit should have roughly equal numbers of points above and below it. Drawing through only the highest points ignores the overall trend and gives inaccurate predictions.
Problem 5: Two variables have a strong positive correlation. If the x-variable increases by 10 units and the slope of the best-fit line is 2.5, what is the expected increase in y?
Answer:
Expected increase = slope × change in x = 2.5 × 10 = 25 units.
Summary
- A scatter plot shows the relationship between two numerical variables by plotting data pairs as dots.
- Positive correlation: both increase together. Negative correlation: one increases as the other decreases. No correlation: no clear pattern.
- Strength: strong = points close to a line; weak = points more spread out.
- The line of best fit summarises the trend — draw it so roughly half the points are on each side.
- Use the line for predictions, but be cautious with extrapolation beyond the data range.
- Correlation does not imply causation.
Related Topics
- Data Collection and Analysis — how to gather and organise data before plotting
- How to Read and Interpret Bar Charts and Pie Charts — other data display methods
- The Coordinate Plane — plotting points, which underpins scatter plot graphing