Data Collection and Analysis — Tables, Charts, and Surveys
Grade: 6-7 | Topic: Statistics
What You Will Learn
After this lesson you will know how data is collected through surveys, observations, and experiments. You will be able to organize raw data into frequency tables and tally charts, identify the difference between categorical and numerical data, and draw conclusions by analyzing organized data. These skills form the foundation for all statistics work.
Theory
Types of Data
Before collecting data, you need to understand what kind of data you are working with.
Categorical data (also called qualitative data) describes qualities or groups:
- Favorite subject: Math, Science, English
- Type of pet: Dog, Cat, Fish
- Transportation mode: Bus, Walk, Car
Numerical data (also called quantitative data) consists of numbers:
- Test scores: 75, 82, 91
- Heights in cm: 148, 155, 162
- Number of siblings: 0, 1, 2, 3
Numerical data can be further divided:
- Discrete — countable, whole numbers (number of students, number of pets)
- Continuous — measurable, can include decimals (height, weight, time)
Methods of Data Collection
| Method | What It Is | Example |
|---|---|---|
| Survey / Questionnaire | Ask people questions | "What is your favorite sport?" |
| Observation | Watch and record what happens | Counting cars at an intersection |
| Experiment | Test something under controlled conditions | Measuring plant growth with different amounts of water |
| Existing data | Use records that already exist | Census data, school attendance records |
Designing a Good Survey
A good survey gives reliable results. Follow these rules:
- Use a large enough sample — asking 5 people is not enough; ask at least 30 if possible.
- Choose a representative sample — make sure the people you ask reflect the whole group.
- Avoid leading questions — "Don't you think pizza is the best lunch?" pushes people toward a specific answer.
- Keep questions clear — avoid double-barreled questions like "Do you like math and science?"
- Use closed-ended questions for easy analysis — "What is your favorite subject: Math, Science, English, or Other?"
Organizing Data: Tally Charts and Frequency Tables
Raw data is hard to interpret. Organizing it makes patterns visible.
Tally chart: Uses tally marks (groups of 5, written as four strokes with a diagonal cross-stroke) to count occurrences as you collect data.
Frequency table: Lists each category or value with its frequency (count).
| Value/Category | Tally | Frequency |
|---|---|---|
| Category A | IIIII II | 7 |
| Category B | IIIII | 5 |
| Category C | III | 3 |
| Total | | 15 |
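The step from a count to a row of tally marks can be sketched in Python. This is only an illustration: the function name `tally_marks` is made up for this lesson, and it writes each group of five as `IIIII` to match the table above instead of drawing a real diagonal stroke.

```python
def tally_marks(count):
    """Return tally marks for a count, in groups of five separated by spaces."""
    full_groups, leftover = divmod(count, 5)
    groups = ["IIIII"] * full_groups
    if leftover:
        groups.append("I" * leftover)
    return " ".join(groups)


# Counts taken from the sample table above.
frequencies = {"Category A": 7, "Category B": 5, "Category C": 3}

for category, count in frequencies.items():
    print(f"{category}: {tally_marks(count)} ({count})")

print("Total:", sum(frequencies.values()))
```

Adding up the frequency column is a quick check: the total should equal the number of responses collected.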
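Relative frequency shows each category as a fraction or percentage of the total:
$$\text{Relative frequency} = \frac{\text{frequency}}{\text{total}}$$
For Category A in the table above, the relative frequency is $\frac{7}{15} \approx 0.47$, or about 47%.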
Drawing Conclusions from Data
Once data is organized, you can:
- Identify the mode — the category or value with the highest frequency
- Calculate the mean — the average of numerical data
- Find the range — the spread from smallest to largest
- Spot trends — patterns or changes over time
- Compare groups — see if one group differs from another
Worked Examples
Example 1: Building a Frequency Table (Easy)
Problem: A teacher asked 20 students about their favorite fruit. The responses were: Apple, Banana, Apple, Orange, Banana, Apple, Grape, Banana, Apple, Orange, Banana, Grape, Apple, Banana, Orange, Apple, Banana, Orange, Grape, Apple.
Build a frequency table and find the mode.
Step 1: Count each fruit:
| Fruit | Frequency |
|---|---|
| Apple | 7 |
| Banana | 6 |
| Orange | 4 |
| Grape | 3 |
| Total | 20 |
Step 2: The mode is the fruit with the highest frequency.
Answer: The mode is Apple (frequency 7). Apple is the most popular fruit.
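The same count can be checked with a short Python sketch. It uses `Counter` from Python's standard `collections` module; the variable names are chosen for this example only.

```python
from collections import Counter

responses = [
    "Apple", "Banana", "Apple", "Orange", "Banana",
    "Apple", "Grape", "Banana", "Apple", "Orange",
    "Banana", "Grape", "Apple", "Banana", "Orange",
    "Apple", "Banana", "Orange", "Grape", "Apple",
]

# Counter tallies how many times each response appears.
frequency_table = Counter(responses)

# most_common() lists (value, count) pairs from highest to lowest count.
for fruit, count in frequency_table.most_common():
    print(f"{fruit}: {count}")

mode_fruit, mode_count = frequency_table.most_common(1)[0]
print("Total:", sum(frequency_table.values()))
print("Mode:", mode_fruit, "with frequency", mode_count)
```

Note that `most_common(1)` reports only one category, so if two categories were tied for the highest frequency you would need to look at the full table to see both.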
Example 2: Calculating Relative Frequency (Easy)
Problem: Using the fruit data above, what is the relative frequency of Banana?
Step 1: Apply the formula:
$$\text{Relative frequency} = \frac{\text{frequency}}{\text{total}} = \frac{6}{20} = 0.30 = 30\%$$
Answer: The relative frequency of Banana is 30% — almost one-third of the students prefer bananas.
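As a rough sketch, the calculation can be repeated for every fruit at once in Python. The helper name `relative_frequency` is an assumption made for this example, and the counts are the ones from the fruit frequency table.

```python
def relative_frequency(frequency, total):
    """Return frequency as a proportion of total (a number from 0 to 1)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return frequency / total


fruit_counts = {"Apple": 7, "Banana": 6, "Orange": 4, "Grape": 3}
total = sum(fruit_counts.values())

for fruit, count in fruit_counts.items():
    share = relative_frequency(count, total)
    # Show the fraction, the decimal, and the percentage side by side.
    print(f"{fruit}: {count}/{total} = {share:.2f} = {share:.0%}")
```

A useful check is that the relative frequencies of all categories add up to 1 (or 100%).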
Example 3: Grouped Frequency Table for Numerical Data (Medium)
Problem: Test scores for 15 students: 52, 67, 71, 45, 88, 73, 61, 95, 78, 82, 56, 69, 84, 77, 63. Organize into groups of 10 (40-49, 50-59, etc.) and identify which range has the most students.
Step 1: Sort into groups:
| Score Range | Tally | Frequency |
|---|---|---|
| 40-49 | I | 1 |
| 50-59 | II | 2 |
| 60-69 | IIII | 4 |
| 70-79 | IIII | 4 |
| 80-89 | III | 3 |
| 90-99 | I | 1 |
| Total | | 15 |
Step 2: The ranges 60-69 and 70-79 are tied with 4 students each.
Answer: The most common score ranges are 60-69 and 70-79 (4 students each). Together they hold 8 of the 15 students, so just over half the class scored between 60 and 79.
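The grouping in Step 1 can be checked with a small Python sketch. The function `grouped_frequencies` and its parameters are hypothetical names for this example, and it assumes whole-number scores and equal-width groups.

```python
scores = [52, 67, 71, 45, 88, 73, 61, 95, 78, 82, 56, 69, 84, 77, 63]


def grouped_frequencies(values, start, width, group_count):
    """Count how many values fall in each group of the given width.

    Groups are start..start+width-1, then the next `width` values, and so on.
    Assumes whole-number data; a value outside every group raises ValueError.
    """
    counts = {}
    for i in range(group_count):
        low = start + i * width
        counts[f"{low}-{low + width - 1}"] = 0
    for value in values:
        index = (value - start) // width
        if not 0 <= index < group_count:
            raise ValueError(f"{value} does not fit in any group")
        low = start + index * width
        counts[f"{low}-{low + width - 1}"] += 1
    return counts


table = grouped_frequencies(scores, start=40, width=10, group_count=6)
for score_range, count in table.items():
    print(f"{score_range}: {count}")

# Collect every range that shares the highest count, so ties are not hidden.
highest = max(table.values())
most_common_ranges = [r for r, c in table.items() if c == highest]
print("Most common range(s):", most_common_ranges)
```

Because each score is placed by integer division, it can land in only one group, which is the same property the non-overlapping ranges give you on paper.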
Example 4: Identifying Bias in a Survey (Medium)
Problem: A student wants to know the most popular after-school activity at their school of 500 students. They survey 10 students from the basketball team. Is this a good survey?
Step 1 — Sample size: 10 out of 500 is only 2% — too small.
Step 2 — Representativeness: All 10 are from the basketball team. They are much more likely to say "Sports" than students from music club or art class.
Step 3 — Conclusion: This survey is biased because:
- The sample is too small
- The sample is not representative of the whole school
Answer: This is a biased survey. A better approach would be to randomly select at least 50 students from different clubs and grades.
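One way to picture random selection is the Python sketch below. The roster is hypothetical: the numbers 1 to 500 stand in for the school's students, and the seed value is arbitrary and only there to make the example repeatable.

```python
import random

# Hypothetical roster: 500 student ID numbers standing in for the whole school.
all_students = list(range(1, 501))

# A fixed seed makes the example repeatable; a real survey would not fix it.
rng = random.Random(2024)

# sample() picks without repeats, so no student is surveyed twice.
survey_group = rng.sample(all_students, k=50)

print("Students surveyed:", len(survey_group))
print("Share of the school:", f"{len(survey_group) / len(all_students):.0%}")
```

A simple random draw like this gives every student the same chance of being picked. It does not by itself guarantee that every club and grade appears in the sample, so it is still worth checking who ended up in the group.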
Example 5: Comparing Two Data Sets (Challenging)
Problem: Class A test scores: 72, 78, 80, 85, 90. Class B test scores: 60, 75, 82, 88, 95. Compare the two classes using mean and range.
Step 1 — Class A mean:
$$\text{Mean}_A = \frac{72 + 78 + 80 + 85 + 90}{5} = \frac{405}{5} = 81$$
Step 2 — Class B mean:
$$\text{Mean}_B = \frac{60 + 75 + 82 + 88 + 95}{5} = \frac{400}{5} = 80$$
Step 3 — Ranges:
Class A: $90 - 72 = 18$. Class B: $95 - 60 = 35$.
Step 4 — Analysis: The means are nearly equal (81 vs 80), so both classes performed similarly on average. However, Class B's range (35) is almost double Class A's range (18), meaning Class B's scores are much more spread out — some students did very well while others struggled.
Answer: Both classes have a similar average (81 vs 80), but Class B has a much wider range (35 vs 18), indicating more variation in student performance.
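The comparison can also be written as a short Python sketch. It uses `mean` from Python's standard `statistics` module; the helper name `summarize` is an assumption for this example.

```python
from statistics import mean

class_a = [72, 78, 80, 85, 90]
class_b = [60, 75, 82, 88, 95]


def summarize(scores):
    """Return the mean and range of a list of scores."""
    return {"mean": mean(scores), "range": max(scores) - min(scores)}


summary_a = summarize(class_a)
summary_b = summarize(class_b)

print("Class A:", summary_a)
print("Class B:", summary_b)
print("Difference in means:", summary_a["mean"] - summary_b["mean"])
print("Difference in ranges:", summary_b["range"] - summary_a["range"])
```

Reporting both measures side by side makes it harder to overlook the spread when the averages look alike.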
Common Mistakes
Mistake 1: Using a biased sample and thinking the results are reliable
❌ Surveying only your friends about the best school lunch and presenting the results as "what everyone thinks."
✅ Use a random sample that includes students from different grades, interests, and backgrounds.
Why this matters: Biased samples produce results that only represent one group, not the whole population.
Mistake 2: Confusing frequency with relative frequency
In a survey of 50 people, 15 chose "Math."
❌ "The relative frequency of Math is 15."
✅ The frequency is 15. The relative frequency is $\frac{15}{50} = 0.30 = 30\%$.
Why this matters: Frequency is a count. Relative frequency is a proportion (fraction, decimal, or percentage). They answer different questions.
Mistake 3: Choosing wrong group intervals for numerical data
Data ranges from 40 to 99.
❌ Using groups 40-50, 50-60, 60-70... (overlapping — does 50 go in the first or second group?).
✅ Use non-overlapping groups: 40-49, 50-59, 60-69, 70-79, 80-89, 90-99.
Why this matters: Overlapping intervals mean some values could be counted twice, making your frequency table inaccurate.
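To see the problem concretely, the Python sketch below lists every group a value falls into. The function name `groups_containing` is made up for this example, and each group is written as a (lowest, highest) pair with both ends included.

```python
def groups_containing(value, groups):
    """Return every (low, high) group that contains value, ends included."""
    return [(low, high) for low, high in groups if low <= value <= high]


overlapping = [(40, 50), (50, 60), (60, 70)]
non_overlapping = [(40, 49), (50, 59), (60, 69)]

# With overlapping groups, a score of 50 matches two groups.
print("50 with overlapping groups:", groups_containing(50, overlapping))

# With non-overlapping groups, it matches exactly one.
print("50 with non-overlapping groups:", groups_containing(50, non_overlapping))
```

If any value in your data matches more than one group, the intervals need to be redrawn before you start counting.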
Practice Problems
Try these on your own before checking the answers:
- Classify each as categorical or numerical: (a) Favorite color (b) Number of books read (c) Shoe size (d) Type of pet.
- A survey of 25 students shows: Soccer = 8, Basketball = 6, Tennis = 4, Swimming = 7. What is the relative frequency of Swimming?
- Create a grouped frequency table for these ages: 11, 14, 12, 16, 13, 15, 12, 11, 14, 13, 16, 15, 12, 14, 11. Use groups 11-12, 13-14, 15-16.
- A researcher wants to know if students prefer online or in-person learning. They only survey students in the computer lab. Is this sample biased? Explain.
- Data set A: 10, 10, 10, 10, 10 (mean = 10, range = 0). Data set B: 2, 6, 10, 14, 18 (mean = 10, range = 16). Both have the same mean but very different ranges. What does this tell you?
Answers
- (a) Categorical (b) Numerical (discrete) (c) Numerical (d) Categorical.
- Relative frequency $= \frac{7}{25} = 0.28 = 28\%$.
- Ages 11-12: frequency 6. Ages 13-14: frequency 5. Ages 15-16: frequency 4. Total: 15.
- Yes, it is biased. Students in the computer lab may be more comfortable with computers than the typical student, so they are probably more likely to prefer online learning. The sample leaves out students who rarely use the lab.
- While both sets have the same average, Set A has no variation (all values are identical) while Set B has values spread far from the mean. The range reveals differences that the mean alone cannot show.
Summary
- Data can be categorical (qualities) or numerical (numbers, either discrete or continuous).
- Common collection methods: surveys, observations, experiments, existing records.
- A good survey uses a large, representative, unbiased sample with clear questions.
- Frequency tables organize raw data by counting occurrences; relative frequency expresses counts as proportions.
- Use group intervals for numerical data when there are many distinct values — make sure intervals do not overlap.
- Always compare data sets using multiple measures (mean, range) for a complete picture.
Related Topics
- Statistics Basics — Mean, Median, Mode, and Data Analysis
- Mean, Median, Mode, and Range — How to Find Each One
- How to Read and Interpret Bar Charts and Pie Charts