Data Collection and Analysis — Tables, Charts, and Surveys

Grade: 6-7 | Topic: Statistics

What You Will Learn

After this lesson you will know how data is collected through surveys, observations, and experiments. You will be able to organize raw data into frequency tables and tally charts, identify the difference between categorical and numerical data, and draw conclusions by analyzing organized data. These skills form the foundation for all statistics work.

Theory

Types of Data

Before collecting data, you need to understand what kind of data you are working with.

Categorical data (also called qualitative data) describes qualities or groups:

  • Favorite subject: Math, Science, English
  • Type of pet: Dog, Cat, Fish
  • Transportation mode: Bus, Walk, Car

Numerical data (also called quantitative data) consists of numbers:

  • Test scores: 75, 82, 91
  • Heights in cm: 148, 155, 162
  • Number of siblings: 0, 1, 2, 3

Numerical data can be further divided:

  • Discrete — countable, whole numbers (number of students, number of pets)
  • Continuous — measurable, can include decimals (height, weight, time)

Methods of Data Collection

| Method | What It Is | Example |
|---|---|---|
| Survey / Questionnaire | Ask people questions | "What is your favorite sport?" |
| Observation | Watch and record what happens | Counting cars at an intersection |
| Experiment | Test something under controlled conditions | Measuring plant growth with different amounts of water |
| Existing data | Use records that already exist | Census data, school attendance records |

Designing a Good Survey

A good survey gives reliable results. Follow these rules:

  1. Use a large enough sample — asking 5 people is not enough; ask at least 30 if possible.
  2. Choose a representative sample — make sure the people you ask reflect the whole group.
  3. Avoid leading questions — "Don't you think pizza is the best lunch?" pushes people toward a specific answer.
  4. Keep questions clear — avoid double-barreled questions like "Do you like math and science?"
  5. Use closed-ended questions for easy analysis — "What is your favorite subject: Math, Science, English, or Other?"

Organizing Data: Tally Charts and Frequency Tables

Raw data is hard to interpret. Organizing it makes patterns visible.

Tally chart: Uses tally marks (groups of 5, written as four strokes with a diagonal cross-stroke) to count occurrences as you collect data.

Frequency table: Lists each category or value with its frequency (count).

| Value/Category | Tally | Frequency |
|---|---|---|
| Category A | IIIII II | 7 |
| Category B | IIIII | 5 |
| Category C | III | 3 |
| Total | | 15 |

Relative frequency shows each category as a fraction or percentage of the total:

$$\text{Relative frequency} = \frac{\text{Frequency of category}}{\text{Total frequency}}$$
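
As a quick check of the formula, here is how it works out for the three categories in the sample table above (total frequency 15). The percentages are rounded to one decimal place, and they add up to 100%.

```latex
\begin{aligned}
\text{Category A: } & \frac{7}{15} \approx 0.467 \approx 46.7\% \\
\text{Category B: } & \frac{5}{15} \approx 0.333 \approx 33.3\% \\
\text{Category C: } & \frac{3}{15} = 0.2 = 20\%
\end{aligned}
```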

Drawing Conclusions from Data

Once data is organized, you can:

  1. Identify the mode — the category or value with the highest frequency
  2. Calculate the mean — the average of numerical data
  3. Find the range — the spread from smallest to largest
  4. Spot trends — patterns or changes over time
  5. Compare groups — see if one group differs from another
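
The first three measures in the list above can be computed in a few lines. The sketch below is one possible way to do it in Python, using the standard `statistics` module. The `siblings` list is hypothetical data made up for illustration; it is not taken from the lesson.

```python
from statistics import mean, multimode

# Hypothetical data: number of siblings reported by 10 students.
siblings = [2, 1, 0, 1, 3, 1, 2, 2, 1, 1]

modes = multimode(siblings)                  # most frequent value(s), as a list
average = mean(siblings)                     # sum of the values / how many values
data_range = max(siblings) - min(siblings)   # largest value minus smallest value

print("Mode(s):", modes)
print("Mean:", average)
print("Range:", data_range)
```

`multimode` returns a list because a data set can have more than one mode when two or more values are tied for the highest frequency.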

Worked Examples

Example 1: Building a Frequency Table (Easy)

Problem: A teacher asked 20 students about their favorite fruit. The responses were: Apple, Banana, Apple, Orange, Banana, Apple, Grape, Banana, Apple, Orange, Banana, Grape, Apple, Banana, Orange, Apple, Banana, Orange, Grape, Apple.

Build a frequency table and find the mode.

Step 1: Count each fruit:

| Fruit | Frequency |
|---|---|
| Apple | 7 |
| Banana | 6 |
| Orange | 4 |
| Grape | 3 |
| Total | 20 |

Step 2: The mode is the fruit with the highest frequency.

Answer: The mode is Apple (frequency 7). Apple is the most popular fruit.
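
If you want to check a hand count like this one, a short program can do the tallying. This is a minimal sketch in Python that uses `collections.Counter` on the 20 responses from the problem; the variable names are our own choice.

```python
from collections import Counter

# The 20 responses, in the order given in the problem.
responses = [
    "Apple", "Banana", "Apple", "Orange", "Banana",
    "Apple", "Grape", "Banana", "Apple", "Orange",
    "Banana", "Grape", "Apple", "Banana", "Orange",
    "Apple", "Banana", "Orange", "Grape", "Apple",
]

# Counter does the tallying: it maps each fruit to how often it appears.
frequency = Counter(responses)

# Print the frequency table, highest count first, then the total.
for fruit, count in frequency.most_common():
    print(f"{fruit:<8}{count}")
print(f"{'Total':<8}{sum(frequency.values())}")

# The mode is the category with the highest frequency.
mode_fruit, mode_count = frequency.most_common(1)[0]
print("Mode:", mode_fruit, "with frequency", mode_count)
```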

Example 2: Calculating Relative Frequency (Easy)

Problem: Using the fruit data above, what is the relative frequency of Banana?

Step 1: Apply the formula:

$$\text{Relative frequency of Banana} = \frac{6}{20} = \frac{3}{10} = 0.3 = 30\%$$

Answer: The relative frequency of Banana is 30% — almost one-third of the students prefer bananas.
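
The same formula can be applied to every fruit at once. The sketch below assumes the counts from the frequency table in Example 1 and divides each one by the total; the dictionary names are illustrative only.

```python
# Frequencies from the fruit survey in Example 1.
counts = {"Apple": 7, "Banana": 6, "Orange": 4, "Grape": 3}
total = sum(counts.values())  # total frequency (20 students)

# Relative frequency = frequency of category / total frequency.
relative = {fruit: count / total for fruit, count in counts.items()}

# Show each relative frequency as a decimal and as a percentage.
for fruit, share in relative.items():
    print(f"{fruit:<8}{share:.2f}  ({share:.0%})")
```

A useful self-check: the relative frequencies of all categories should add up to 1 (or 100%).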

Example 3: Grouped Frequency Table for Numerical Data (Medium)

Problem: Test scores for 15 students: 52, 67, 71, 45, 88, 73, 61, 95, 78, 82, 56, 69, 84, 77, 63. Organize into groups of 10 (40-49, 50-59, etc.) and identify which range has the most students.

Step 1: Sort into groups:

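| Score Range | Tally | Frequency |
|---|---|---|
| 40-49 | I | 1 |
| 50-59 | II | 2 |
| 60-69 | IIII | 4 |
| 70-79 | IIII | 4 |
| 80-89 | III | 3 |
| 90-99 | I | 1 |
| Total | | 15 |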

Step 2: The ranges 60-69 and 70-79 are tied with 4 students each.

Answer: The most common score ranges are 60-69 and 70-79 (4 students each). Most students scored between 60 and 79.
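
Sorting scores into groups by hand is easy to get wrong, so it can help to verify the table with a short program. The sketch below is one possible approach: it finds the lower end of each score's group with integer division and then counts the groups. The helper name `group_label` is our own.

```python
from collections import Counter

scores = [52, 67, 71, 45, 88, 73, 61, 95, 78, 82, 56, 69, 84, 77, 63]
width = 10  # each group covers 10 scores, e.g. 40-49


def group_label(score, width=10):
    """Return the label of the group a score belongs to, e.g. 67 -> '60-69'."""
    low = (score // width) * width  # round down to the start of the group
    return f"{low}-{low + width - 1}"


# Count how many scores fall in each group.
grouped = Counter(group_label(s, width) for s in scores)

# Print the groups in order from 40-49 up to 90-99.
for low in range(40, 100, width):
    label = f"{low}-{low + width - 1}"
    print(f"{label:<7}{grouped[label]}")

# Find every group tied for the highest frequency.
highest = max(grouped.values())
most_common_ranges = sorted(label for label, n in grouped.items() if n == highest)
print("Most common range(s):", most_common_ranges)
```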

Example 4: Identifying Bias in a Survey (Medium)

Problem: A student wants to know the most popular after-school activity at their school of 500 students. They survey 10 students from the basketball team. Is this a good survey?

Step 1 — Sample size: 10 out of 500 is only 2% — too small.

Step 2 — Representativeness: All 10 are from the basketball team. They are much more likely to say "Sports" than students from music club or art class.

Step 3 — Conclusion: This survey is biased because:

  • The sample is too small
  • The sample is not representative of the whole school

Answer: This is a biased survey. A better approach would be to randomly select at least 50 students from different clubs and grades.
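
One way to pick students at random is to let a computer choose from the full school list. The sketch below assumes a hypothetical roster in which every student has an ID number from 1 to 500; the roster and the seed value are made up for illustration.

```python
import random

# Hypothetical roster: ID numbers 1..500 stand in for the whole school.
roster = list(range(1, 501))

# A fixed seed is used only so that this sketch gives the same sample each run.
rng = random.Random(42)

# Choose 50 different students; every student has the same chance of being picked.
sample = rng.sample(roster, 50)

print("First few selected IDs:", sorted(sample)[:10])
print(f"Sample size: {len(sample)} ({len(sample) / len(roster):.0%} of the school)")
```

A simple random sample like this usually includes students from many clubs and grades, but it does not guarantee it. If you need every grade to be represented, you could draw a separate random sample from each grade instead.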

Example 5: Comparing Two Data Sets (Challenging)

Problem: Class A test scores: 72, 78, 80, 85, 90. Class B test scores: 60, 75, 82, 88, 95. Compare the two classes using mean and range.

Step 1 — Class A mean:

$$\text{Mean}_A = \frac{72 + 78 + 80 + 85 + 90}{5} = \frac{405}{5} = 81$$

Step 2 — Class B mean:

$$\text{Mean}_B = \frac{60 + 75 + 82 + 88 + 95}{5} = \frac{400}{5} = 80$$

Step 3 — Ranges:

$$\text{Range}_A = 90 - 72 = 18$$

$$\text{Range}_B = 95 - 60 = 35$$

Step 4 — Analysis: The means are nearly equal (81 vs 80), so both classes performed similarly on average. However, Class B's range (35) is almost double Class A's range (18), meaning Class B's scores are much more spread out — some students did very well while others struggled.

Answer: Both classes have a similar average ($\approx 80$), but Class B has a much wider range (35 vs 18), indicating more variation in student performance.
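
The comparison can also be checked with a short program. This is a minimal sketch that computes the mean and range of each class; the helper name `summarize` is our own choice.

```python
from statistics import mean

class_a = [72, 78, 80, 85, 90]
class_b = [60, 75, 82, 88, 95]


def summarize(scores):
    """Return (mean, range) for a list of scores."""
    return mean(scores), max(scores) - min(scores)


mean_a, range_a = summarize(class_a)
mean_b, range_b = summarize(class_b)

print(f"Class A: mean = {mean_a}, range = {range_a}")
print(f"Class B: mean = {mean_b}, range = {range_b}")
```

Reporting both numbers side by side makes it clear that two classes can have nearly the same average while their scores are spread out very differently.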

Common Mistakes

Mistake 1: Using a biased sample and thinking the results are reliable

❌ Surveying only your friends about the best school lunch and presenting the results as "what everyone thinks."

✅ Use a random sample that includes students from different grades, interests, and backgrounds.

Why this matters: Biased samples produce results that only represent one group, not the whole population.

Mistake 2: Confusing frequency with relative frequency

In a survey of 50 people, 15 chose "Math."

❌ "The relative frequency of Math is 15."

✅ The frequency is 15. The relative frequency is $\frac{15}{50} = 0.30 = 30\%$.

Why this matters: Frequency is a count. Relative frequency is a proportion (fraction, decimal, or percentage). They answer different questions.

Mistake 3: Choosing wrong group intervals for numerical data

Data ranges from 40 to 99.

❌ Using groups 40-50, 50-60, 60-70... (overlapping — does 50 go in the first or second group?).

✅ Use non-overlapping groups: 40-49, 50-59, 60-69, 70-79, 80-89, 90-99.

Why this matters: Overlapping intervals mean some values could be counted twice, making your frequency table inaccurate.
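
To see the double counting happen, the sketch below tallies a small set of hypothetical scores twice: once with overlapping intervals and once with non-overlapping ones. The scores and the helper name `tally` are made up for illustration.

```python
# Hypothetical scores; 50 and 60 sit exactly on the interval boundaries.
scores = [40, 50, 55, 60, 70]

overlapping = [(40, 50), (50, 60), (60, 70)]
non_overlapping = [(40, 49), (50, 59), (60, 69), (70, 79)]


def tally(values, intervals):
    """Count how many values fall inside each interval (both ends included)."""
    return {
        f"{low}-{high}": sum(low <= v <= high for v in values)
        for low, high in intervals
    }


bad = tally(scores, overlapping)
good = tally(scores, non_overlapping)

print("Overlapping:    ", bad, "total =", sum(bad.values()))
print("Non-overlapping:", good, "total =", sum(good.values()))
```

With the overlapping intervals the frequencies add up to 7 even though there are only 5 scores, because 50 and 60 are each counted in two groups. With the non-overlapping intervals the total matches the number of scores.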

Practice Problems

Try these on your own before checking the answers:

  1. Classify each as categorical or numerical: (a) Favorite color (b) Number of books read (c) Shoe size (d) Type of pet.
  2. A survey of 25 students shows: Soccer = 8, Basketball = 6, Tennis = 4, Swimming = 7. What is the relative frequency of Swimming?
  3. Create a grouped frequency table for these ages: 11, 14, 12, 16, 13, 15, 12, 11, 14, 13, 16, 15, 12, 14, 11. Use groups 11-12, 13-14, 15-16.
  4. A researcher wants to know if students prefer online or in-person learning. They only survey students in the computer lab. Is this sample biased? Explain.
  5. Data set A: 10, 10, 10, 10, 10 (mean = 10, range = 0). Data set B: 2, 6, 10, 14, 18 (mean = 10, range = 16). Both have the same mean but very different ranges. What does this tell you?

Answers

  1. (a) Categorical (b) Numerical (discrete) (c) Numerical (d) Categorical.
  2. Relative frequency $= \frac{7}{25} = 0.28 = 28\%$.
  3. Ages 11-12: frequency 6. Ages 13-14: frequency 5. Ages 15-16: frequency 4. Total: 15.
  4. Yes, it is biased. Students in the computer lab are more likely to prefer online learning since they are already comfortable with computers.
  5. While both sets have the same average, Set A has no variation (all values are identical) while Set B has values spread far from the mean. The range reveals differences that the mean alone cannot show.

Summary

  • Data can be categorical (qualities) or numerical (numbers, either discrete or continuous).
  • Common collection methods: surveys, observations, experiments, existing records.
  • A good survey uses a large, representative, unbiased sample with clear questions.
  • Frequency tables organize raw data by counting occurrences; relative frequency expresses counts as proportions.
  • Use group intervals for numerical data when there are many distinct values — make sure intervals do not overlap.
  • Always compare data sets using multiple measures (mean, range) for a complete picture.
