📚

Created by Nikki Walworth


Data Driven: Analyzing the Statistics of Our Daily Lives

Grade 10 | Math | 18 days

In this 10th-grade mathematics project, students act as data analysts to investigate and reveal the "real story" behind trends within their own communities. By collecting original data on topics such as local prices or social media usage, students move beyond simple calculations to construct sophisticated visual representations and apply complex statistical models, including the Normal Distribution and z-scores. The experience culminates in an "Ethical Analyst's Final Brief," a multimedia report where students must use mathematical evidence to communicate findings accurately while navigating the challenges of data manipulation. Through this process, students learn to use statistics as a powerful tool for objective storytelling and community advocacy.

Statistical Modeling, Data Visualization, Normal Distribution, Community Analysis, Ethical Communication, Z-Scores, Outliers


📝

Inquiry Framework

Question Framework

Driving Question

The overarching question that guides the entire project.

As data analysts, how can we use statistical models and visual representations to reveal the "real story" behind a trend in our community and communicate our findings in an ethical, compelling way?

Essential Questions

Supporting questions that break down major concepts.

How can different visual representations (dot plots, histograms, and box plots) reveal different perspectives or "truths" about a dataset?
In what ways do measures of center (mean, median) and spread (interquartile range, standard deviation) help us summarize complex real-world trends?
How can the Normal Distribution model and z-scores help us determine how "extreme" or "typical" a specific data point is within our lives?
When comparing two sets of data, how do we use mathematical evidence to prove that the differences between them are significant rather than random?
How do outliers impact our interpretation of data, and how should a responsible data analyst decide whether to include or exclude them from a report?
How can data be manipulated or misrepresented in the media, and how can we use statistics to communicate our findings ethically and accurately?

Standards & Learning Goals

Learning Goals

By the end of this project, students will be able to:

Construct and interpret graphical representations (dot plots, histograms, and box plots) to accurately describe the distribution and shape of community-based data.
Analyze and compare multiple datasets using measures of center (mean, median) and spread (interquartile range, standard deviation) to derive evidence-based conclusions.
Apply properties of the Normal Distribution and calculate z-scores to evaluate the relative standing and "typicality" of specific data points within a population.
Identify and assess the impact of outliers on statistical measures to determine appropriate methods for data reporting and ethical representation.
Communicate complex statistical findings through compelling narratives, ensuring that visual and numerical data are used ethically to avoid misrepresentation.

Common Core State Standards for Mathematics

CCSS.Math.Content.HSS.ID.A.1

Primary

Represent data with plots on the real number line (dot plots, histograms, and box plots).

Reason: This standard is foundational to the project's goal of using visual representations to reveal the 'real story' behind community trends.

CCSS.Math.Content.HSS.ID.A.2

Primary

Use statistics appropriate to the shape of the data distribution to compare center (median, mean) and spread (interquartile range, standard deviation) of two or more different data sets.

Reason: Students will directly use these measures to summarize complex trends and compare different sets of community data as part of their analysis.

CCSS.Math.Content.HSS.ID.A.3

Primary

Interpret differences in shape, center, and spread in the context of the data sets, accounting for possible effects of extreme data points (outliers).

Reason: A key component of the inquiry framework is understanding how outliers and data shape impact the interpretation and 'truth' of a dataset.

CCSS.Math.Content.HSS.ID.A.4

Primary

Use the mean and standard deviation of a data set to fit it to a normal distribution and to estimate population percentages. Recognize that there are data sets for which such a procedure is not appropriate. Use calculators, spreadsheets, and tables to estimate areas under the normal curve.

Reason: This aligns with the essential question regarding how the Normal Distribution and z-scores help determine how 'extreme' or 'typical' a data point is.

CCSS.Math.Content.HSS.ID.B.5

Secondary

Summarize categorical data for two categories in two-way frequency tables. Interpret relative frequencies in the context of the data (including joint, marginal, and conditional relative frequencies). Recognize possible associations and trends in the data.

Reason: While the project focuses on one-variable statistics, comparing community groups often requires summarizing categorical data through tables to identify trends.

CCSS.Math.Content.HSS.ID.B.6

Supporting

Represent data on two quantitative variables on a scatter plot, and describe how the variables are related.

Reason: Though the project's primary focus is one-variable statistics, students may explore relationships between variables to provide deeper context for their community reports.

CCSS.Math.Content.HSS.ID.C.7

Supporting

Interpret the slope (rate of change) and the intercept (constant term) of a linear model in the context of the data.

Reason: This provides a foundation for ethical communication if students choose to use linear models to forecast community trends.

Common Core State Standards for Mathematical Practice

CCSS.Math.Practice.MP4

Primary

Model with mathematics.

Reason: As 'data analysts,' students are applying mathematical tools to solve real-world problems and represent community trends, which is the core of mathematical modeling.

Entry Events

Events that will be used to introduce the project to students

Dream Job Reality Check: The Median vs. The Myth

Students are given 'Salary Envelopes' for various 'Dream Jobs' (Pro Gamer, Influencer, Nurse, Software Engineer) that contain raw data sets of actual earnings across the industry. They must investigate why the 'Average Salary' advertised on recruiting websites often looks nothing like the 'Median' reality, leading to a discussion on how skewed data can influence their major life decisions and career paths.
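
To preview the discussion, a teacher might run a quick check like the sketch below. The earnings list is invented for illustration (it is not real industry data); it only shows how a single very large value can pull the mean far above the median in a right-skewed dataset.

```python
from statistics import mean, median

# Hypothetical yearly earnings in dollars for ten streamers (invented data).
earnings = [0, 500, 1_200, 2_000, 3_500, 5_000, 8_000, 12_000, 30_000, 900_000]

avg = mean(earnings)    # pulled upward by the single 900,000 value
mid = median(earnings)  # middle of the sorted list, unaffected by how large the top value is

print(f"mean:   {avg:,.0f}")
print(f"median: {mid:,.0f}")
print(f"earners below the mean: {sum(e < avg for e in earnings)} of {len(earnings)}")
```

In this made-up sample, nine of the ten earners fall below the "average," which is the kind of gap the Salary Envelopes are meant to surface.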

📚

Portfolio Activities

These activities progressively build towards your learning goals, with each submission contributing to the student's final portfolio.

Activity 1

The Data Snapshot: Triple-Threat Visuals

In this introductory activity, students will transition from the 'Dream Job' entry event to collecting their own community-relevant data. Students will select a single quantitative variable that impacts their daily lives (e.g., minutes spent on social media, price of a gallon of gas at different local stations, or daily steps). They will create three different visual representations of this data to see how the 'story' changes depending on the visual format used.

Steps

Here is some basic scaffolding to help students complete the activity.

1. Identify a community-based quantitative variable and collect at least 30 data points (e.g., survey classmates, observe local traffic, or research local prices).

2. Construct a dot plot to identify individual data points and potential clusters.

3. Group the data into appropriate intervals and create a histogram to visualize the overall frequency distribution.

4. Determine the five-number summary (min, Q1, median, Q3, max) and construct a box plot to highlight the spread and potential outliers.
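
For classes that use a programming tool instead of a spreadsheet, the steps above could be sketched roughly as follows. The data values, the bin width of 60, and the helper names are assumptions made for illustration. Quartile conventions also differ between textbooks and software, so Q1 and Q3 may not match a hand calculation exactly; this sketch uses the "inclusive" method from Python's standard library.

```python
from collections import Counter
from statistics import quantiles

# Hypothetical sample: daily minutes on social media for 12 students (invented data;
# the activity itself asks for at least 30 points).
minutes = [15, 30, 30, 45, 60, 60, 60, 75, 90, 120, 150, 240]

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the 'inclusive' quartile method."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)

def histogram_counts(values, width):
    """Count values per interval [start, start + width), keeping empty intervals."""
    bins = Counter((v // width) * width for v in values)
    first, last = min(bins), max(bins)
    return {start: bins[start] for start in range(first, last + width, width)}

summary = five_number_summary(minutes)
counts = histogram_counts(minutes, width=60)

print("min, Q1, median, Q3, max:", summary)
for start, n in counts.items():
    # One star per data point gives a quick text histogram.
    print(f"{start:>3} to {start + 59:<3} | {'*' * n}")
```

Keeping the empty 180 to 239 interval in the output matters: the gap before the 240-minute value is part of the story the histogram tells.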

Final Product

What students will submit as the final product of the activity

A 'Visual Trio' Infographic featuring a dot plot, a histogram, and a box plot of their chosen community dataset, accompanied by a brief reflection on which visual provides the clearest picture of the data's distribution.

Alignment

How this activity aligns with the learning objectives & standards

Aligns with CCSS.Math.Content.HSS.ID.A.1: Represent data with plots on the real number line (dot plots, histograms, and box plots). This activity focuses on the technical skill of data visualization as a means to observe initial patterns.

Activity 2

The Great Comparison: Decoding Center and Spread

Now that students can visualize data, they must learn to describe it mathematically. In this activity, students compare their community dataset with a partner's or a contrasting group (e.g., 'Student Sleep Times' vs. 'Recommended Sleep Times'). They will calculate measures of center and spread and determine which measures are most 'honest' based on the presence of skewness or outliers.

Steps

Here is some basic scaffolding to help students complete the activity.

1. Calculate the mean and standard deviation for your dataset using a calculator or spreadsheet.

2. Calculate the median and Interquartile Range (IQR) for the same dataset.

3. Identify any outliers using the 1.5 × IQR rule (values below Q1 − 1.5 × IQR or above Q3 + 1.5 × IQR) and describe how they 'pull' the mean or distort the standard deviation.

4. Write a comparison paragraph explaining which dataset (yours or your partner's) has more variability and which measure of center (mean or median) is the most accurate representation of the 'typical' experience.
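
The calculations in these steps can be sketched in a few lines of Python. The sleep data and the helper name `iqr_outliers` are hypothetical. The sketch uses the sample standard deviation (`stdev`, which divides by n − 1) and the "inclusive" quartile method; a calculator or spreadsheet may use slightly different conventions.

```python
from statistics import mean, median, stdev, quantiles

# Hypothetical nightly sleep times in hours (invented for illustration).
sleep_hours = [6.0, 6.5, 7.0, 7.0, 7.5, 7.5, 8.0, 8.0, 8.5, 13.0]

def iqr_outliers(values):
    """Return (iqr, low_fence, high_fence, outliers) using the 1.5 x IQR rule."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return iqr, low, high, [v for v in values if v < low or v > high]

iqr, low, high, outliers = iqr_outliers(sleep_hours)
trimmed = [v for v in sleep_hours if v not in outliers]

print(f"mean = {mean(sleep_hours):.2f}, median = {median(sleep_hours):.2f}")
print(f"SD = {stdev(sleep_hours):.2f}, IQR = {iqr:.2f}")
print(f"fences = ({low:.2f}, {high:.2f}), outliers = {outliers}")
# Recomputing without the outlier shows how much it pulls the mean and inflates the SD.
print(f"without outliers: mean = {mean(trimmed):.2f}, SD = {stdev(trimmed):.2f}")
```

Printing the statistics with and without the flagged value gives students concrete numbers to cite when arguing which measure is the more 'honest' one.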

Final Product

What students will submit as the final product of the activity

A 'Statistical Face-Off' Comparative Report that uses calculated measures (Mean, Median, SD, IQR) to prove which group's data is more consistent or more extreme.

Alignment

How this activity aligns with the learning objectives & standards

Aligns with CCSS.Math.Content.HSS.ID.A.2 and HSS.ID.A.3. It requires students to use appropriate statistics (mean/SD vs. median/IQR) based on data shape and to interpret the impact of outliers on these measures.

Activity 3

The 'Normal' Test: Are You an Outlier?

In this activity, students investigate whether their community data follows the 'Bell Curve.' They will use their previously calculated mean and standard deviation to model their data as a Normal Distribution. They will then calculate z-scores for specific, interesting data points (like a very high gas price or a very low sleep time) to determine exactly how 'rare' or 'typical' those points are in a broader context.

Steps

Here is some basic scaffolding to help students complete the activity.

1. Test your data for 'normality' by checking if it is roughly symmetric and bell-shaped.

2. Draw a normal curve labeled with the mean and three standard deviations in both directions.

3. Select three interesting data points from your set and calculate their z-scores using the formula (x - mean) / SD.

4. Use a z-table or a spreadsheet function (for example, NORM.S.DIST(z, TRUE) in Excel, where TRUE requests the cumulative area to the left of z) to find the percentile for these points, explaining what percentage of the community falls below these values.
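
As a rough sketch of steps 3 and 4, the z-score formula (x - mean) / SD and the percentile lookup can be reproduced with Python's standard library. The gas prices are invented, and the cutoffs in `label` (|z| below 2 is 'Average', 2 up to 3 is 'Unusual', 3 or more is 'Extreme') are an assumption for illustration; the project does not define exact thresholds, so a class should agree on its own.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical gas prices in dollars per gallon (invented for illustration).
prices = [3.19, 3.25, 3.29, 3.29, 3.35, 3.39, 3.39, 3.45, 3.49, 3.59]

mu, sigma = mean(prices), stdev(prices)  # sample mean and sample SD
standard_normal = NormalDist()           # mean 0, SD 1

def z_score(x):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

def percentile(x):
    """Percent of a normally distributed population expected to fall below x."""
    return 100 * standard_normal.cdf(z_score(x))

def label(z):
    """Assumed cutoffs: |z| < 2 'Average', 2 <= |z| < 3 'Unusual', otherwise 'Extreme'."""
    size = abs(z)
    if size < 2:
        return "Average"
    return "Unusual" if size < 3 else "Extreme"

for price in (3.19, 3.39, 3.59):
    z = z_score(price)
    print(f"${price:.2f}: z = {z:+.2f}, percentile = {percentile(price):.1f}, {label(z)}")
```

The percentile is only meaningful if the data passed the shape check in step 1; for clearly skewed data, the normal model's percentages can be far from the actual share of data points below a value.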

Final Product

What students will submit as the final product of the activity

A 'Typicality Profile' consisting of a labeled Normal Distribution curve and a series of z-score calculations that categorize specific data points as 'Average,' 'Unusual,' or 'Extreme.'

Alignment

How this activity aligns with the learning objectives & standards

Aligns with CCSS.Math.Content.HSS.ID.A.4. This activity specifically addresses fitting data to a normal distribution, calculating z-scores, and using technology to find areas under the curve.

Activity 4

The Final Brief: Telling the Ethical Story

In this final portfolio activity, students look for associations between their quantitative data and a categorical variable (e.g., 'Gender' and 'Favorite Social Media App' or 'Neighborhood' and 'Commute Type'). They will build two-way frequency tables to find trends and then synthesize all their findings into an ethical data report. They must address the Essential Question: How can data be manipulated, and how will I ensure my report is 'The Real Story'?

Steps

Here is some basic scaffolding to help students complete the activity.

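1. Create a two-way frequency table to compare your primary data against a categorical variable (e.g., grade level or location). Because a two-way table counts categories, first group your quantitative data into categories (e.g., 'above the median' and 'at or below the median').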

2. Calculate marginal and conditional relative frequencies to identify any associations or trends.

3. Review your entire analysis (Visuals, Center/Spread, Normal Model) and identify one way the data *could* be misrepresented to tell a false story.

4. Draft your final brief, clearly stating your evidence-based conclusion and explaining the ethical choices you made in representing the data.
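
Steps 1 and 2 can be sketched as below. The survey rows are invented, and the sketch assumes the quantitative variable has already been grouped into 'High' and 'Low' categories so that both variables are categorical. The helper name `conditional` is hypothetical.

```python
from collections import Counter

# Hypothetical survey rows: (grade level, screen-time group). Invented data.
responses = (
    [("Grade 9", "High")] * 12 + [("Grade 9", "Low")] * 8
    + [("Grade 10", "High")] * 6 + [("Grade 10", "Low")] * 14
)

joint = Counter(responses)                          # count for each cell of the table
row_totals = Counter(row for row, _ in responses)   # marginal counts by grade
col_totals = Counter(col for _, col in responses)   # marginal counts by screen-time group
n = len(responses)

def conditional(col, given_row):
    """Relative frequency of `col` among respondents in `given_row`."""
    return joint[(given_row, col)] / row_totals[given_row]

print(f"Marginal: {col_totals['High'] / n:.0%} of all respondents are 'High'")
for grade in ("Grade 9", "Grade 10"):
    print(f"Conditional: {conditional('High', grade):.0%} of {grade} are 'High'")
```

When the conditional relative frequencies differ noticeably from one row to the next, that suggests an association between the two variables. It does not show that one causes the other, which is a useful point for the ethics discussion in step 3.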

Final Product

What students will submit as the final product of the activity

The 'Ethical Analyst's Final Brief': a multimedia report (presentation or digital document) that combines visuals, summary statistics, and categorical trends to tell the 'real story' of a community issue while explicitly avoiding common statistical pitfalls.

Alignment

How this activity aligns with the learning objectives & standards

Aligns with CCSS.Math.Content.HSS.ID.B.5 and HSS.ID.C.9. It focuses on summarizing categorical data in two-way tables and the ethical responsibility of a data analyst to avoid misrepresentation.

🏆

Rubric & Reflection

Portfolio Rubric

Grading criteria for assessing the overall project portfolio

Community Data Analyst Portfolio Rubric

Category 1

Data Visualization & Representation (HSS.ID.A.1)

Assesses the technical construction and analytical interpretation of one-variable data plots.

Criterion 1

Data Visualization Accuracy

Accuracy and clarity of data visualizations (dot plots, histograms, and box plots) constructed from community-sourced data.

Exemplary

4 Points

All three visual representations (dot plot, histogram, box plot) are precisely constructed with appropriate scales, clear labels, and innovative design. The reflection provides a sophisticated analysis of how each format uniquely reveals the data's story.

Proficient

3 Points

All three visual representations are accurately constructed with correct labels and scales. The reflection clearly explains which visual provides the best picture of the distribution.

Developing

2 Points

Visual representations are mostly accurate, but may have minor scaling errors or missing labels. The reflection provides a basic or inconsistent explanation of the data's distribution.

Beginning

1 Point

Visual representations are incomplete, inaccurate, or missing. There is little to no reflection on the distribution of the data.

Criterion 2

Visual Strategy & Analysis

The ability to justify the choice of visual based on the data's shape and distribution.

Exemplary

4 Points

Demonstrates a sophisticated understanding of how data shape (skewness, clusters) dictates the effectiveness of specific visuals; identifies subtle patterns others might miss.

Proficient

3 Points

Correctly identifies the shape of the data and provides a logical reason for why one visual representation is superior for the specific dataset.

Developing

2 Points

Identifies the shape of the data but struggles to explain why a specific visual is more or less effective for the context.

Beginning

1 Point

Cannot identify the shape of the distribution or explain the utility of different visual representations.

Category 2

Statistical Summarization & Comparison (HSS.ID.A.2, HSS.ID.A.3)

Focuses on the mathematical summarization of data and the ability to compare different sets of information.

Criterion 1

Computational Accuracy & Outlier Analysis

Calculation and application of mean, median, standard deviation, and interquartile range (IQR) to summarize data.

Exemplary

4 Points

Calculations are flawless. Provides a deep analysis of how outliers specifically 'pull' the mean or distort standard deviation, using precise mathematical language.

Proficient

3 Points

Calculations for mean, median, SD, and IQR are accurate. Correctly applies the 1.5 x IQR rule to identify outliers.

Developing

2 Points

Calculations are mostly correct but may contain minor computational errors. Outliers are identified but the 1.5 x IQR rule may be applied incorrectly.

Beginning

1 Point

Calculations are missing or contain significant errors. Fails to identify outliers or use appropriate measures of center/spread.

Criterion 2

Comparative Statistical Reasoning

Comparing two datasets to draw evidence-based conclusions about variability and the 'typical' experience.

Exemplary

4 Points

Provides a compelling, evidence-based argument comparing datasets, showing an advanced understanding of how variability impacts the 'real-world' experience of the community.

Proficient

3 Points

Uses calculated measures (Mean, Median, SD, IQR) to correctly justify which group's data is more consistent or extreme. Conclusion is based on mathematical evidence.

Developing

2 Points

Comparison is present but relies more on intuition than mathematical evidence. Conclusion about which group is 'typical' is vague.

Beginning

1 Point

Does not compare datasets or uses incorrect measures to make a comparison. Conclusion is not supported by data.

Category 3

The Normal Model (HSS.ID.A.4)

Evaluates the student's ability to use the Normal Distribution to assess how data points relate to the population.

Criterion 1

Normal Modeling & Z-Scores

Modeling data using the normal distribution curve and calculating relative standing via z-scores.

Exemplary

4 Points

Constructs a perfectly labeled normal curve and calculates z-scores with 100% accuracy. Provides a sophisticated interpretation of percentiles in the context of the community.

Proficient

3 Points

Constructs a normal curve labeled with mean and standard deviations. Correctly calculates z-scores for three data points and finds corresponding percentiles.

Developing

2 Points

Normal curve is drawn but may be mislabeled. Z-score calculations contain errors or use the formula incorrectly. Percentile interpretation is limited.

Beginning

1 Point

Fails to model data as a normal distribution. Z-score calculations are missing or fundamentally misunderstood.

Criterion 2

Interpretation of Relative Standing

Interpreting 'typicality' and 'extremity' within a real-world context using statistical benchmarks.

Exemplary

4 Points

Evaluates 'typicality' with nuanced insight, considering the limitations of the normal model for the specific dataset provided.

Proficient

3 Points

Accurately categorizes data points as 'Average,' 'Unusual,' or 'Extreme' based on calculated z-scores and normal distribution properties.

Developing

2 Points

Categorizes data points but the reasoning is not consistently tied to z-score values or standard deviations.

Beginning

1 Point

Cannot determine if a data point is 'rare' or 'typical' using statistical evidence.

Category 4

Ethical Analysis & Synthesis (HSS.ID.B.5, HSS.ID.C.9)

Assesses the final synthesis of data, including categorical associations and the ethical communication of findings.

Criterion 1

Categorical Trend Analysis

Constructing and interpreting two-way frequency tables to find associations between variables.

Exemplary

4 Points

Constructs a flawless two-way table and provides a deep analysis of conditional frequencies that reveals hidden associations or trends within the community data.

Proficient

3 Points

Correctly constructs a two-way frequency table and calculates marginal and conditional relative frequencies to identify a clear trend.

Developing

2 Points

Two-way table is present but contains errors in frequency counts or relative frequency calculations. Identification of trends is weak.

Beginning

1 Point

Two-way table is missing or incorrectly structured. Fails to identify any associations between variables.

Criterion 2

Ethical Data Communication

Communicating findings ethically and identifying potential for data misrepresentation.

Exemplary

4 Points

Masterfully synthesizes data into a professional report. Explicitly identifies multiple ways the data could be manipulated and explains how their own choices ensured an ethical 'Real Story.'

Proficient

3 Points

Communicates findings clearly through a multimedia report. Identifies at least one potential misrepresentation and provides an evidence-based conclusion.

Developing

2 Points

Report is presented but lacks a clear connection between evidence and conclusions. Recognition of ethical pitfalls is superficial.

Beginning

1 Point

Final brief is incomplete or presents data in a misleading way. No mention of ethical considerations or potential misrepresentation.

Reflection Prompts

End-of-project reflection questions to get students to think about their learning

Question 1

Reflecting on your role as a data analyst, how did using different visual representations (dot plots vs. histograms vs. box plots) change your own perspective or 'truth' about the community data you collected?

Text

Required

Question 2

When faced with a skewed dataset or one with extreme outliers, how confident do you feel in choosing and justifying the most 'honest' measure of center and spread to represent the 'typical' experience?

Scale

Required

Question 3

As a data analyst, which statistical challenge did you find most difficult to navigate while trying to ensure your final brief told an 'ethical' story?

Multiple choice

Required

Options

Choosing a visual representation that emphasizes a specific bias

Using the mean instead of the median for a heavily skewed dataset

Deciding whether to include or exclude extreme outliers in the final report

Ensuring the sample size was large enough to make a valid claim

Question 4

How did calculating z-scores for specific data points (like gas prices or sleep times) change your understanding of what is 'typical' versus 'extreme' in your own life? Provide one specific example from your data.

Text

Required

Question 5

To what extent do you agree with this statement: 'Using statistical models allows me to see my community more clearly and advocate for change more effectively.'

Scale

Optional