Data Driven: Analyzing the Statistics of Our Daily Lives
Inquiry Framework
Question Framework
Driving Question
The overarching question that guides the entire project.
As data analysts, how can we use statistical models and visual representations to reveal the "real story" behind a trend in our community and communicate our findings in an ethical, compelling way?
Essential Questions
Supporting questions that break down major concepts.
- How can different visual representations (dot plots, histograms, and box plots) reveal different perspectives or "truths" about a dataset?
- In what ways do measures of center (mean, median) and spread (interquartile range, standard deviation) help us summarize complex real-world trends?
- How can the Normal Distribution model and z-scores help us determine how "extreme" or "typical" a specific data point is within our lives?
- When comparing two sets of data, how do we use mathematical evidence to show that the differences between them are meaningful rather than the result of random variation?
- How do outliers impact our interpretation of data, and how should a responsible data analyst decide whether to include or exclude them from a report?
- How can data be manipulated or misrepresented in the media, and how can we use statistics to communicate our findings ethically and accurately?
Standards & Learning Goals
Learning Goals
By the end of this project, students will be able to:
- Construct and interpret graphical representations (dot plots, histograms, and box plots) to accurately describe the distribution and shape of community-based data.
- Analyze and compare multiple datasets using measures of center (mean, median) and spread (interquartile range, standard deviation) to derive evidence-based conclusions.
- Apply properties of the Normal Distribution and calculate z-scores to evaluate the relative standing and "typicality" of specific data points within a population.
- Identify and assess the impact of outliers on statistical measures to determine appropriate methods for data reporting and ethical representation.
- Communicate complex statistical findings through compelling narratives, ensuring that visual and numerical data are used ethically to avoid misrepresentation.
Common Core State Standards for Mathematics
Common Core State Standards for Mathematical Practice
Entry Events
Events that will be used to introduce the project to students.
Dream Job Reality Check: The Median vs. The Myth
Students are given 'Salary Envelopes' for various 'Dream Jobs' (Pro Gamer, Influencer, Nurse, Software Engineer) that contain raw data sets of actual earnings across the industry. They must investigate why the 'Average Salary' advertised on recruiting websites often looks nothing like the 'Median' reality, leading to a discussion on how skewed data can influence their major life decisions and career paths.
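The gap between an advertised 'average' and the median can be sketched in a few lines of Python. The earnings below are invented for illustration and are not real industry data; the point is only that one very large value pulls the mean far above what most people in the list earn.

```python
from statistics import mean, median

# Hypothetical yearly earnings (USD) for ten streamers; invented for illustration.
earnings = [8_000, 12_000, 15_000, 18_000, 20_000,
            22_000, 25_000, 30_000, 45_000, 1_200_000]

avg = mean(earnings)    # pulled upward by the single top earner
mid = median(earnings)  # the middle of the sorted list; resistant to that value

print(f"mean:   {avg:,.0f}")
print(f"median: {mid:,.0f}")
print(f"people earning less than the mean: {sum(e < avg for e in earnings)} of {len(earnings)}")
```

In this made-up envelope, nine of the ten earners fall below the mean, which is the kind of right-skewed pattern the entry event asks students to notice.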
Portfolio Activities
These activities progressively build towards your learning goals, with each submission contributing to the student's final portfolio.
The Data Snapshot: Triple-Threat Visuals
In this introductory activity, students will transition from the 'Dream Job' entry event to collecting their own community-relevant data. Students will select a single quantitative variable that impacts their daily lives (e.g., minutes spent on social media, price of a gallon of gas at different local stations, or daily steps). They will create three different visual representations of this data to see how the 'story' changes depending on the visual format used.
Steps
Here is some basic scaffolding to help students complete the activity.
Final Product
What students will submit as the final product of the activity.
A 'Visual Trio' Infographic featuring a dot plot, a histogram, and a box plot of their chosen community dataset, accompanied by a brief reflection on which visual provides the clearest picture of the data's distribution.
Alignment
How this activity aligns with the learning objectives & standards.
Aligns with CCSS.Math.Content.HSS.ID.A.1: Represent data with plots on the real number line (dot plots, histograms, and box plots). This activity focuses on the technical skill of data visualization as a means to observe initial patterns.
The Great Comparison: Decoding Center and Spread
Now that students can visualize data, they must learn to describe it mathematically. In this activity, students compare their community dataset with a partner's dataset or with a contrasting group (e.g., 'Student Sleep Times' vs. 'Recommended Sleep Times'). They will calculate measures of center and spread and determine which measures are most 'honest' based on the presence of skewness or outliers.
Steps
Here is some basic scaffolding to help students complete the activity.
Final Product
What students will submit as the final product of the activity.
A 'Statistical Face-Off' Comparative Report that uses calculated measures (mean, median, SD, IQR) to show which group's data is more consistent or more extreme.
Alignment
How this activity aligns with the learning objectives & standards.
Aligns with CCSS.Math.Content.HSS.ID.A.2 and HSS.ID.A.3. It requires students to use appropriate statistics (mean/SD vs. median/IQR) based on data shape and to interpret the impact of outliers on these measures.
The 'Normal' Test: Are You an Outlier?
In this activity, students investigate whether their community data follows the 'Bell Curve.' They will use their previously calculated mean and standard deviation to model their data as a Normal Distribution. They will then calculate z-scores for specific, interesting data points (like a very high gas price or a very low sleep time) to determine exactly how 'rare' or 'typical' those points are in a broader context.
Steps
Here is some basic scaffolding to help students complete the activity.
Final Product
What students will submit as the final product of the activity.
A 'Typicality Profile' consisting of a labeled Normal Distribution curve and a series of z-score calculations that categorize specific data points as 'Average,' 'Unusual,' or 'Extreme.'
Alignment
How this activity aligns with the learning objectives & standards.
Aligns with CCSS.Math.Content.HSS.ID.A.4. This activity specifically addresses fitting data to a normal distribution, calculating z-scores, and using technology to find areas under the curve.
The Final Brief: Telling the Ethical Story
In this final portfolio activity, students look for associations between two categorical variables related to their community topic (e.g., 'Gender' and 'Favorite Social Media App' or 'Neighborhood' and 'Commute Type'). They will build two-way frequency tables to find trends and then synthesize all their findings into an ethical data report. They must address the Essential Question: How can data be manipulated, and how will I ensure my report is 'The Real Story'?
Steps
Here is some basic scaffolding to help students complete the activity.
Final Product
What students will submit as the final product of the activity.
The 'Ethical Analyst's Final Brief': a multimedia report (presentation or digital document) that combines visuals, summary statistics, and categorical trends to tell the 'real story' of a community issue while explicitly avoiding common statistical pitfalls.
Alignment
How this activity aligns with the learning objectives & standards.
Aligns with CCSS.Math.Content.HSS.ID.B.5 and HSS.ID.C.9. It focuses on summarizing categorical data in two-way tables and the ethical responsibility of a data analyst to avoid misrepresentation.
Rubric & Reflection
Portfolio Rubric
Grading criteria for assessing the overall project portfolio.
Community Data Analyst Portfolio Rubric
Data Visualization & Representation (HSS.ID.A.1)
Assesses the technical construction and analytical interpretation of one-variable data plots.
Data Visualization Accuracy
Accuracy and clarity of data visualizations (dot plots, histograms, and box plots) constructed from community-sourced data.
Exemplary
4 Points
All three visual representations (dot plot, histogram, box plot) are precisely constructed with appropriate scales, clear labels, and innovative design. The reflection provides a sophisticated analysis of how each format uniquely reveals the data's story.
Proficient
3 Points
All three visual representations are accurately constructed with correct labels and scales. The reflection clearly explains which visual provides the best picture of the distribution.
Developing
2 Points
Visual representations are mostly accurate, but may have minor scaling errors or missing labels. The reflection provides a basic or inconsistent explanation of the data's distribution.
Beginning
1 Point
Visual representations are incomplete, inaccurate, or missing. There is little to no reflection on the distribution of the data.
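As a reference point for checking student plots, the numbers behind each of the three visuals can be sketched in Python. The dataset, the bin width of 60, and the variable names are assumptions made for this illustration; quartile conventions also differ between calculators and software, so box-plot values may not match a given tool exactly.

```python
from collections import Counter
from statistics import quantiles

# Hypothetical data: daily minutes on social media for 15 students (invented).
minutes = [30, 45, 45, 60, 60, 60, 75, 90, 90, 105, 120, 120, 150, 180, 240]

# Dot plot: one dot per observation, stacked above each distinct value.
dot_counts = Counter(minutes)

# Histogram: counts per bin of width 60, keyed by each bin's left edge.
BIN_WIDTH = 60  # assumed bin width; changing it changes the picture
hist_counts = Counter((m // BIN_WIDTH) * BIN_WIDTH for m in minutes)

# Box plot: five-number summary (min, Q1, median, Q3, max).
q1, q2, q3 = quantiles(minutes, n=4)  # Python's default "exclusive" method
five_number = (min(minutes), q1, q2, q3, max(minutes))

# Text rendering of the histogram, one row per bin.
for left in sorted(hist_counts):
    print(f"{left:>3}-{left + BIN_WIDTH - 1:<3} | {'*' * hist_counts[left]}")
print("five-number summary:", five_number)
```

The same fifteen values produce three different summaries: the dot plot keeps every individual value, the histogram shows the overall shape (here a long right tail), and the five-number summary compresses everything into the positions a box plot draws.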
Visual Strategy & Analysis
The ability to justify the choice of visual based on the data's shape and distribution.
Exemplary
4 Points
Demonstrates a sophisticated understanding of how data shape (skewness, clusters) dictates the effectiveness of specific visuals; identifies subtle patterns others might miss.
Proficient
3 Points
Correctly identifies the shape of the data and provides a logical reason for why one visual representation is superior for the specific dataset.
Developing
2 Points
Identifies the shape of the data but struggles to explain why a specific visual is more or less effective for the context.
Beginning
1 Point
Cannot identify the shape of the distribution or explain the utility of different visual representations.
Statistical Summarization & Comparison (HSS.ID.A.2, HSS.ID.A.3)
Focuses on the mathematical summarization of data and the ability to compare different sets of information.
Computational Accuracy & Outlier Analysis
Calculation and application of mean, median, standard deviation, and interquartile range (IQR) to summarize data.
Exemplary
4 Points
Calculations are flawless. Provides a deep analysis of how outliers specifically 'pull' the mean or distort standard deviation, using precise mathematical language.
Proficient
3 Points
Calculations for mean, median, SD, and IQR are accurate. Correctly applies the 1.5 × IQR rule to identify outliers.
Developing
2 Points
Calculations are mostly correct but may contain minor computational errors. Outliers are identified but the 1.5 × IQR rule may be applied incorrectly.
Beginning
1 Point
Calculations are missing or contain significant errors. Fails to identify outliers or use appropriate measures of center/spread.
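The 1.5 × IQR rule named in this criterion flags any value below Q1 − 1.5 × IQR or above Q3 + 1.5 × IQR. The sketch below applies it to invented gas prices and then compares the summary statistics with and without the flagged value; the data and variable names are assumptions for illustration, and the quartiles use Python's default method, which may differ slightly from a graphing calculator.

```python
from statistics import mean, median, quantiles, stdev

# Hypothetical gas prices (USD per gallon) at ten stations; invented for illustration.
prices = [3.29, 3.35, 3.39, 3.39, 3.45, 3.49, 3.49, 3.55, 3.59, 4.89]

q1, _, q3 = quantiles(prices, n=4)
iqr = q3 - q1
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr

# Values outside the fences are flagged as outliers by the rule.
outliers = [p for p in prices if p < low_fence or p > high_fence]
kept = [p for p in prices if low_fence <= p <= high_fence]

print("outliers:", outliers)
for label, data in (("with outlier", prices), ("without outlier", kept)):
    print(f"{label:>15}: mean={mean(data):.3f} median={median(data):.3f} sd={stdev(data):.3f}")
```

With these numbers the single high price shifts the mean and inflates the standard deviation noticeably, while the median barely moves. Flagging a value is not the same as deleting it: the decision to report with or without it still needs a stated reason.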
Comparative Statistical Reasoning
Comparing two datasets to draw evidence-based conclusions about variability and the 'typical' experience.
Exemplary
4 Points
Provides a compelling, evidence-based argument comparing datasets, showing an advanced understanding of how variability impacts the 'real-world' experience of the community.
Proficient
3 Points
Uses calculated measures of center and spread (mean, median, SD, IQR) to correctly justify which group's data is more consistent or extreme. Conclusion is based on mathematical evidence.
Developing
2 Points
Comparison is present but relies more on intuition than mathematical evidence. Conclusion about which group is 'typical' is vague.
Beginning
1 Point
Does not compare datasets or uses incorrect measures to make a comparison. Conclusion is not supported by data.
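A comparison of this kind can be sketched by computing the same four measures for both groups. The two sleep datasets below are invented so that they share the same center but differ in spread; `summarize` is a helper name introduced here for illustration only.

```python
from statistics import mean, median, quantiles, stdev

def summarize(data):
    """Return center (mean, median) and spread (SD, IQR) for one dataset."""
    q1, _, q3 = quantiles(data, n=4)
    return {"mean": mean(data), "median": median(data),
            "sd": stdev(data), "iqr": q3 - q1}

# Hypothetical hours of sleep per night for two groups; invented for illustration.
group_a = [6.5, 7.0, 7.0, 7.5, 7.5, 7.5, 8.0, 8.0, 8.5]
group_b = [4.0, 5.5, 6.0, 7.0, 7.5, 8.0, 9.0, 9.5, 11.0]

a, b = summarize(group_a), summarize(group_b)
for name, stats in (("A", a), ("B", b)):
    print(name, {k: round(v, 2) for k, v in stats.items()})

more_consistent = "A" if a["sd"] < b["sd"] and a["iqr"] < b["iqr"] else "B or unclear"
print("more consistent group:", more_consistent)
```

Both groups have a mean and median of 7.5 hours, so measures of center alone cannot separate them. The claim that group A is more consistent rests on its smaller SD and IQR.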
The Normal Model (HSS.ID.A.4)
Evaluates the student's ability to use the Normal Distribution to assess how data points relate to the population.
Normal Modeling & Z-Scores
Modeling data using the normal distribution curve and calculating relative standing via z-scores.
Exemplary
4 Points
Constructs a perfectly labeled normal curve and calculates z-scores with 100% accuracy. Provides a sophisticated interpretation of percentiles in the context of the community.
Proficient
3 Points
Constructs a normal curve labeled with the mean and standard deviations. Correctly calculates z-scores for three data points and finds corresponding percentiles.
Developing
2 Points
Normal curve is drawn but may be mislabeled. Z-score calculations contain errors or use the formula incorrectly. Percentile interpretation is limited.
Beginning
1 Point
Fails to model data as a normal distribution. Z-score calculations are missing or fundamentally misunderstood.
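A z-score is the distance of a value from the mean measured in standard deviations, z = (x − mean) / SD, and the matching percentile is the area under the normal curve to the left of x. The sketch below works this out for three data points with Python's standard library; the sleep data are invented, and using the sample standard deviation (rather than the population one) is an assumption of this example.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical nightly sleep times in hours; invented for illustration.
sleep_hours = [6.5, 7.0, 7.0, 7.5, 7.5, 7.5, 8.0, 8.0, 8.5]

mu = mean(sleep_hours)
sigma = stdev(sleep_hours)      # sample SD; statistics.pstdev gives the population SD
model = NormalDist(mu, sigma)   # normal model fitted to the data's mean and SD

def z_score(x):
    """Number of standard deviations x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

def percentile(x):
    """Percent of the normal model lying below x (area under the curve to the left)."""
    return 100 * model.cdf(x)

for x in (6.5, 7.5, 8.5):
    print(f"{x} h: z = {z_score(x):+.2f}, percentile = {percentile(x):.1f}")
```

The percentiles describe the fitted model, not the raw data, so they are only as trustworthy as the assumption that the data are roughly bell-shaped.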
Interpretation of Relative Standing
Interpreting 'typicality' and 'extremity' within a real-world context using statistical benchmarks.
Exemplary
4 Points
Evaluates 'typicality' with nuanced insight, considering the limitations of the normal model for the specific dataset provided.
Proficient
3 Points
Accurately categorizes data points as 'Average,' 'Unusual,' or 'Extreme' based on calculated z-scores and normal distribution properties.
Developing
2 Points
Categorizes data points but the reasoning is not consistently tied to z-score values or standard deviations.
Beginning
1 Point
Cannot determine if a data point is 'rare' or 'typical' using statistical evidence.
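One way to make the three labels concrete is to tie them to z-score cutoffs. The rubric does not specify cutoffs, so the values used below (|z| of 2 for 'Unusual' and 3 for 'Extreme') are an assumption borrowed from the common 68-95-99.7 rule of thumb, and both function names are hypothetical. The second helper gives a rough check on the normal model itself by measuring how much of the data actually falls within k standard deviations of the mean.

```python
from statistics import mean, stdev

def categorize(z, unusual_at=2.0, extreme_at=3.0):
    """Label a z-score; the cutoffs are assumed defaults, not fixed rules."""
    size = abs(z)
    if size >= extreme_at:
        return "Extreme"
    if size >= unusual_at:
        return "Unusual"
    return "Average"

def share_within(data, k=1.0):
    """Fraction of the data lying within k sample standard deviations of the mean."""
    mu, sigma = mean(data), stdev(data)
    inside = sum(1 for x in data if abs(x - mu) <= k * sigma)
    return inside / len(data)

for z in (0.4, -2.5, 3.2):
    print(f"z = {z:+.1f} -> {categorize(z)}")

# A normal model predicts roughly 68% within 1 SD; a very different share
# suggests the model is a poor fit for this dataset.
print(share_within([1, 2, 3, 4, 5], k=1.0))
```

If the observed share within one or two standard deviations is far from what the normal model predicts, labels based on z-scores should be reported with that caveat.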
Ethical Analysis & Synthesis (HSS.ID.B.5, HSS.ID.C.9)
Assesses the final synthesis of data, including categorical associations and the ethical communication of findings.
Categorical Trend Analysis
Constructing and interpreting two-way frequency tables to find associations between variables.
Exemplary
4 Points
Constructs a flawless two-way table and provides a deep analysis of conditional frequencies that reveals hidden associations or trends within the community data.
Proficient
3 Points
Correctly constructs a two-way frequency table and calculates marginal and conditional relative frequencies to identify a clear trend.
Developing
2 Points
Two-way table is present but contains errors in frequency counts or relative frequency calculations. Identification of trends is weak.
Beginning
1 Point
Two-way table is missing or incorrectly structured. Fails to identify any associations between variables.
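The pieces of a two-way table (cell counts, marginal totals, and conditional relative frequencies) can be sketched with simple counters. The survey responses below are invented, and the choice to condition on neighborhood (the row variable) is an assumption of this example; conditioning on the column variable answers a different question.

```python
from collections import Counter

# Hypothetical survey responses as (neighborhood, commute type); invented for illustration.
responses = ([("North", "Bus")] * 12 + [("North", "Car")] * 8 +
             [("South", "Bus")] * 5 + [("South", "Car")] * 15)

cell = Counter(responses)                       # joint counts for each table cell
row_totals = Counter(r for r, _ in responses)   # marginal counts per neighborhood
col_totals = Counter(c for _, c in responses)   # marginal counts per commute type
total = len(responses)

# Marginal relative frequency: share of all respondents in each commute type.
marginal = {c: n / total for c, n in col_totals.items()}

# Conditional relative frequency: share of each neighborhood using each commute type.
conditional = {(r, c): n / row_totals[r] for (r, c), n in cell.items()}

for (r, c), share in sorted(conditional.items()):
    print(f"P({c} | {r}) = {share:.2f}")
print("marginal:", marginal)
```

In this made-up table, 60% of North respondents take the bus compared with 25% of South respondents, a difference in conditional frequencies that suggests an association between neighborhood and commute type. The raw counts alone would hide this if the two neighborhoods had very different sample sizes.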
Ethical Data Communication
Communicating findings ethically and identifying potential for data misrepresentation.
Exemplary
4 Points
Masterfully synthesizes data into a professional report. Explicitly identifies multiple ways the data could be manipulated and explains how their own choices ensured an ethical 'Real Story.'
Proficient
3 Points
Communicates findings clearly through a multimedia report. Identifies at least one potential misrepresentation and provides an evidence-based conclusion.
Developing
2 Points
Report is presented but lacks a clear connection between evidence and conclusions. Recognition of ethical pitfalls is superficial.
Beginning
1 Point
Final brief is incomplete or presents data in a misleading way. No mention of ethical considerations or potential misrepresentation.