
Astro-Python: Exoplanet Discovery through Light Curve Analysis
Inquiry Framework
Question Framework
Driving Question
The overarching question that guides the entire project.
How can we architect a Python-based diagnostic tool to decode the light signatures of distant stars, distinguishing true planetary transits from stellar noise through mathematical modeling and iterative data analysis?
Essential Questions
Supporting questions that break down major concepts.
- How can we translate the physical phenomenon of a planetary transit into a mathematical model that a computer can process?
- What are the foundational building blocks of Python (variables, loops, and conditionals) required to parse and analyze large astronomical datasets?
- How do we distinguish between 'noise' (stellar activity) and 'signal' (a true exoplanet) through algorithmic logic?
- In what ways does data visualization (using libraries like Matplotlib) allow us to identify patterns that are invisible in raw numerical data?
- How does the iterative process of debugging code mirror the scientific method when testing hypotheses about celestial bodies?
Standards & Learning Goals
Learning Goals
By the end of this project, students will be able to:
- Master foundational Python programming syntax, including variables, data types, loops, and conditional logic, to manipulate astronomical datasets.
- Develop a data processing pipeline that reads and cleans raw stellar flux data to prepare it for algorithmic analysis.
- Construct mathematical models of planetary transits and implement them in code to differentiate between periodic dips and random stellar noise.
- Utilize the Matplotlib library to create clear, informative visualizations of light curves that highlight detected transit events.
- Apply computational thinking and iterative debugging strategies to refine the accuracy of the exoplanet detection algorithm.
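To make the third goal more concrete, here is a hedged sketch of one way periodic dips might be told apart from random ones: merge neighboring flagged points into single events, then check whether the events are evenly spaced. The function names, the `max_gap` and `tolerance` parameters, and the sample times are all hypothetical, and this is a simplified stand-in for the period-search methods used on real survey data.

```python
def group_into_events(dip_times, max_gap):
    """Merge flagged times that lie within max_gap of each other.

    Returns one mid-time (the average) per group of neighboring points.
    """
    events, current = [], []
    for t in sorted(dip_times):
        # A large jump in time means the previous dip has ended.
        if current and t - current[-1] > max_gap:
            events.append(sum(current) / len(current))
            current = []
        current.append(t)
    if current:
        events.append(sum(current) / len(current))
    return events


def looks_periodic(event_times, tolerance):
    """Return True if three or more events are evenly spaced within tolerance."""
    if len(event_times) < 3:
        return False  # too few events to judge regularity
    gaps = [later - earlier for earlier, later in zip(event_times, event_times[1:])]
    return max(gaps) - min(gaps) <= tolerance


# Hypothetical flagged times (days): three short dips about 3 days apart.
flagged = [1.0, 1.5, 4.0, 4.5, 7.0, 7.5]
events = group_into_events(flagged, max_gap=1.0)
print(events, looks_periodic(events, tolerance=0.1))
```

Evenly spaced events are consistent with an orbiting planet, while isolated or irregular dips are more likely stellar activity or instrument noise.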
CSTA K-12 Computer Science Standards
ACM/IEEE Computer Science Curricula
Next Generation Science Standards (NGSS)
Common Core State Standards for Mathematical Practice
Entry Events
Events that will be used to introduce the project to students.
The 'Earth 2.0' Recruitment Drive
The event begins with a debate on the Fermi Paradox ('Where is everybody?'), followed by a demonstration of the sheer volume of unanalyzed space data. Students are told they will no longer be spectators of science but will write the foundational code necessary to join the global hunt for Earth 2.0, turning their laptops into professional-grade observatories.
Interstellar Real Estate: The Habitability Audit
Students act as lead data scientists for an interstellar colonization firm that needs to verify 'Real Estate' candidates. They must develop Python scripts to calculate the size, orbital period, and temperature of potential planets, ultimately pitching which 'world' justifies a multi-billion-dollar exploration mission based on their data.
Portfolio Activities
These activities progressively build toward the learning goals, with each submission contributing to the student's final portfolio.
Mission Briefing: The Stellar Ledger
In this introductory activity, students set up their Python environment (such as Jupyter Notebooks or VS Code) and learn to store astronomical data using variables. They will calculate a star's basic properties, such as luminosity and habitable zone boundaries, using Python as a high-powered calculator. This establishes the foundation for handling the numerical data found in light curves.
Steps
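One possible starting scaffold for 'stellar_profile.py' is sketched below. It is an illustration rather than a prescribed solution: the star values are placeholders for a Sun-like star, luminosity comes from the Stefan-Boltzmann law written in solar units, and the habitable-zone edges use a rough square-root-of-luminosity estimate whose constants (1.1 and 0.53) are an assumption, not something this project specifies.

```python
import math

# Solar effective temperature in kelvin (IAU nominal value).
T_SUN_K = 5772.0

# Star properties stored as plain variables (placeholder Sun-like values).
star_name = "Example Star"      # hypothetical name
star_mass_solar = 1.0           # mass in solar masses (recorded, not used below)
star_radius_solar = 1.0         # radius in solar radii
star_temperature_k = 5772.0     # effective temperature in kelvin

# Stefan-Boltzmann law in solar units: L / L_sun = (R / R_sun)^2 * (T / T_sun)^4
luminosity_solar = star_radius_solar ** 2 * (star_temperature_k / T_SUN_K) ** 4

# Rough habitable-zone edges in AU; 1.1 and 0.53 are assumed stellar-flux limits.
hz_inner_au = math.sqrt(luminosity_solar / 1.1)
hz_outer_au = math.sqrt(luminosity_solar / 0.53)

print(f"{star_name}: luminosity = {luminosity_solar:.2f} L_sun")
print(f"Habitable zone: {hz_inner_au:.2f} to {hz_outer_au:.2f} AU")
```

Changing `star_radius_solar` or `star_temperature_k` and rerunning the script shows how strongly luminosity depends on temperature.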
Final Product
What students will submit as the final product of the activity.
A Python script ('stellar_profile.py') that defines a star's characteristics (name, mass, radius, temperature) and outputs its calculated luminosity and habitable zone distances.
Alignment
How this activity aligns with the learning objectives & standards.
Primary Alignment: ACM/IEEE CS1.1 (Variables and Expressions). This activity introduces the basic syntax needed to store and manipulate scientific data, moving from conceptual stellar properties to computational variables.
The Flux Filter: Logic in the Light
Students move from static variables to dynamic data processing. They will learn to use 'for' loops to iterate through a list of stellar flux (brightness) values and 'if/else' statements to flag moments where the brightness drops, signifying a potential planetary transit. This simulates the automated scanning of thousands of data points.
Steps
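A minimal sketch of the loop-and-conditional logic for this activity, assuming normalized flux (1.0 is the star's usual brightness) and a fixed cutoff. The sample values, the variable names, and the 0.995 threshold are hypothetical; a full submission would run on 100 or more values.

```python
# Hypothetical sample: times in days and normalized flux values.
times = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
flux = [1.000, 0.999, 1.001, 0.990, 0.989, 0.991, 1.000, 1.001]

DIP_THRESHOLD = 0.995  # assumed cutoff: anything dimmer than this is flagged

transit_times = []
for time, brightness in zip(times, flux):
    if brightness < DIP_THRESHOLD:
        # Brightness dropped below the cutoff: record the timestamp.
        transit_times.append(time)
        print(f"t = {time} d: possible transit (flux = {brightness})")
    else:
        print(f"t = {time} d: normal brightness")

print("Flagged timestamps:", transit_times)
```

A fixed threshold is the simplest choice. It works only when the flux has been normalized and the noise is smaller than the dip, which is worth discussing with students before moving on.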
Final Product
What students will submit as the final product of the activity.
A 'Transit Detector' script that processes a list of 100+ flux values and prints the specific timestamps where a potential planet was detected.
Alignment
How this activity aligns with the learning objectives & standards.
Primary Alignment: ACM/IEEE CS1.1 (Loops and Conditionals) and CSTA 3B-AP-14 (Constructing solutions). Students use control flow to automate the inspection of data points, a core programming skill.
The Data Cartographer: Visualizing Distant Suns
Raw numbers can be misleading. In this activity, students learn to use the Matplotlib library to transform their numerical lists into visual light curves. They will learn to customize plots to distinguish between 'noise' (random fluctuations) and the distinct 'U-shape' of a planetary transit, a critical skill for any data scientist.
Steps
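The plotting step might look something like the sketch below. The function names, the threshold line, the colors, and the default filename are illustrative assumptions; only standard Matplotlib calls (`subplots`, `plot`, `axhline`, `set_xlabel`, `set_ylabel`, `set_title`, `legend`, `savefig`) are used.

```python
def find_dip_indices(flux, threshold):
    """Return the positions in the flux list that fall below the threshold."""
    return [i for i, value in enumerate(flux) if value < threshold]


def plot_light_curve(times, flux, threshold, filename="light_curve.png"):
    """Save a labeled light curve with flagged points highlighted in red."""
    # Imported here so find_dip_indices() still works without Matplotlib installed.
    import matplotlib.pyplot as plt

    dips = find_dip_indices(flux, threshold)
    fig, ax = plt.subplots(figsize=(8, 4))
    ax.plot(times, flux, "k.", label="Flux")
    ax.plot([times[i] for i in dips], [flux[i] for i in dips], "r.", label="Flagged dip")
    ax.axhline(threshold, linestyle="--", color="gray", label="Threshold")
    ax.set_xlabel("Time (days)")
    ax.set_ylabel("Normalized flux")
    ax.set_title("Light curve with flagged transit points")
    ax.legend()
    fig.savefig(filename)
    plt.close(fig)
    return filename
```

Calling `plot_light_curve(times, flux, 0.995)` with two equal-length lists writes a PNG to the working directory. Plotting points rather than a connected line keeps individual noisy measurements visible next to the dip.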
Final Product
What students will submit as the final product of the activity.
A professionally formatted Light Curve Plot (PNG or PDF) with labeled axes, a title, and annotated transit events.
Alignment
How this activity aligns with the learning objectives & standards.
Primary Alignment: NGSS SEP-4 (Analyzing and interpreting data) and CSTA 3B-DA-06 (Computational modeling). This aligns with the visual nature of astronomical discovery, where patterns are identified through graphical analysis.
The Transit Architect: Building the Diagnostic Tool
Students will now create Python functions to model the geometry of a transit. They will write code that takes the 'depth' of a light curve dip and calculates the radius of the exoplanet relative to its host star. This activity teaches students how to encapsulate complex logic into reusable 'tools' that can be applied to any star system.
Steps
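As a hedged sketch of what such functions could look like: the version below assumes the standard approximation that the fractional transit depth equals the square of the planet-to-star radius ratio, and it estimates the period as the average spacing between observed transits. The function names come from this activity's final product; the signatures, units, and error messages are assumptions made for illustration.

```python
import math

R_SUN_KM = 695_700.0    # nominal solar radius in kilometers
R_EARTH_KM = 6_371.0    # mean Earth radius in kilometers


def calculate_planet_radius(transit_depth, star_radius_solar):
    """Return the planet radius in Earth radii.

    Assumes depth = (R_planet / R_star) ** 2, where depth is the fractional
    drop in flux (0.01 means a 1% dip).
    """
    if not 0 < transit_depth < 1:
        raise ValueError("transit_depth must be between 0 and 1")
    if star_radius_solar <= 0:
        raise ValueError("star_radius_solar must be positive")
    radius_ratio = math.sqrt(transit_depth)
    return radius_ratio * star_radius_solar * R_SUN_KM / R_EARTH_KM


def estimate_orbital_period(transit_times):
    """Return the average spacing between consecutive transit mid-times."""
    if len(transit_times) < 2:
        raise ValueError("need at least two transits to estimate a period")
    ordered = sorted(transit_times)
    return (ordered[-1] - ordered[0]) / (len(ordered) - 1)
```

For example, `calculate_planet_radius(0.01, 1.0)` gives roughly 10.9 Earth radii, so a 1% dip around a Sun-sized star points to a planet close to Jupiter's size.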
Final Product
What students will submit as the final product of the activity.
A custom Python module containing functions like 'calculate_planet_radius()' and 'estimate_orbital_period()' to be used in the final project pipeline.
Alignment
How this activity aligns with the learning objectives & standards.
Primary Alignment: CCSS.MATH.PRACTICE.MP4 (Modeling with mathematics) and CSTA 3B-AP-14 (Procedures/Modules). This focuses on the transition from simple scripts to reusable, functional code.
The Earth 2.0 Pipeline: From Code to Discovery
In the final portfolio activity, students integrate their previous scripts into a complete data processing pipeline. They will read a CSV dataset of simulated Kepler data, clean it by removing 'NaN' (Not a Number) values, run their transit-detection logic, and generate a final 'Habitability Audit' report for their chosen planet candidate.
Steps
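The load, clean, and detect stages could be wired together roughly as follows. This sketch uses only the standard library; the column names `time` and `flux`, the `min_drop` parameter, the inline sample data, and the `data.csv` filename in the comment are assumptions for illustration, since the dataset's actual layout is not specified here.

```python
import csv
import io
import math
import statistics


def load_light_curve(csv_file):
    """Return (times, flux) lists, skipping rows with missing or NaN values."""
    times, flux = [], []
    for row in csv.DictReader(csv_file):
        try:
            t, f = float(row["time"]), float(row["flux"])
        except (TypeError, ValueError):
            continue  # blank or non-numeric cell
        if math.isnan(t) or math.isnan(f):
            continue  # explicit NaN entry
        times.append(t)
        flux.append(f)
    return times, flux


def detect_dips(times, flux, min_drop=0.005):
    """Return times where flux falls more than min_drop below the median."""
    baseline = statistics.median(flux)
    return [t for t, f in zip(times, flux) if f < baseline - min_drop]


# Stand-in for a real file; on disk this would be open("data.csv", newline="").
sample_csv = io.StringIO(
    "time,flux\n0.0,1.000\n0.5,NaN\n1.0,0.990\n1.5,1.001\n2.0,\n2.5,0.999\n"
)
times, flux = load_light_curve(sample_csv)
dip_times = detect_dips(times, flux)
print(f"Kept {len(times)} clean rows; possible transit at t = {dip_times}")
```

Using the median as the baseline keeps a deep dip from dragging the reference level down, which a plain average would allow. Students who have met Pandas could replace the loader with a DataFrame and a dropna step.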
Final Product
What students will submit as the final product of the activity.
A comprehensive 'Earth 2.0 Discovery Portfolio' consisting of the final Python pipeline code, a visualized light curve, and a written justification of the planet's potential habitability.
Alignment
How this activity aligns with the learning objectives & standards.
Primary Alignment: CSTA 3B-AP-14 (Constructing solutions) and NGSS SEP-4 (Scientific claims from models). This represents the culmination of all previous skills into a unified scientific workflow.
Rubric & Reflection
Portfolio Rubric
Grading criteria for assessing the overall project portfolio.
Astro-Python: Exoplanet Discovery Rubric
Programming Fundamentals
Core computer science concepts including syntax, control flow, and algorithmic implementation.
Pythonic Foundations & Logic
Evaluates the student's ability to use variables, data types, loops, and conditional statements to process astronomical data.
- Exemplary (4 Points): Demonstrates masterful use of Python syntax; control flow (loops/conditionals) is optimized for performance; variables follow professional naming conventions (snake_case); uses advanced features such as list comprehensions or nested logic accurately.
- Proficient (3 Points): Correctly implements variables, for-loops to iterate through flux data, and conditional statements to flag transit events; code is functional and largely follows standard style guidelines.
- Developing (2 Points): Basic loops and conditionals are present but may contain logic errors (e.g., off-by-one errors) or inconsistent variable naming; code requires some manual intervention to run correctly.
- Beginning (1 Point): Syntax errors prevent the code from executing; demonstrates significant misunderstanding of how to use loops or if-statements to filter data.
Scientific Modeling
The application of physics and mathematics through computational structures.
Mathematical Modeling & Modular Design
Assesses the ability to translate the physical phenomenon of a planetary transit into a functional Python module.
- Exemplary (4 Points): Functions are expertly designed with clear input/output parameters, comprehensive docstrings, and robust error handling; mathematical models (like transit depth to radius) are implemented with high precision.
- Proficient (3 Points): Successfully encapsulates mathematical formulas into reusable functions; includes basic documentation (comments) and returns accurate calculations for planetary properties.
- Developing (2 Points): Mathematical formulas are used in the script but not encapsulated in reusable functions; some calculations of luminosity or radius contain inaccuracies.
- Beginning (1 Point): Formula implementation is incorrect or missing; the physics of the project is not translated into code-based logic.
Data Interpretation & Visualization
Transforming raw data into meaningful graphical representations for analysis.
Algorithmic Visualization
Evaluates the effectiveness of data visualizations in identifying patterns and distinguishing signal from noise.
- Exemplary (4 Points): Creates sophisticated, multi-layered visualizations; uses advanced Matplotlib features (e.g., subplots, annotations for transit events, customized style sheets) to make a compelling scientific case for a discovery.
- Proficient (3 Points): Produces clear, well-labeled light curves with appropriate axes and titles; visualizes the 'U-shape' of a transit effectively to support data claims.
- Developing (2 Points): Generates basic plots but lacks essential labeling (units, axis names); visualizations are cluttered or do not clearly highlight the transit event.
- Beginning (1 Point): Visualizations are missing, incorrect, or fail to represent the relationship between time and flux.
Engineering & Analysis
Managing the lifecycle of data from ingestion to final reporting.
Data Cleaning & Pipeline Integration
Assesses the ability to handle real-world scientific data, including loading CSV files and cleaning noise.
- Exemplary (4 Points): Implements advanced data cleaning strategies to handle outliers and missing 'NaN' values; uses Pandas or specialized modules to efficiently manage large-scale astronomical datasets.
- Proficient (3 Points): Successfully imports data from external files; applies filters to remove obvious stellar noise and prepares the dataset for the analysis pipeline.
- Developing (2 Points): Attempts to load external data but struggles with file paths or data formats; data cleaning is minimal, leading to 'noisy' results in the detection phase.
- Beginning (1 Point): Unable to load or process data from external sources; pipeline fails at the ingestion stage.
Final Project Synthesis
The culmination of technical skills into a coherent scientific finding.
Scientific Discovery & Synthesis
Evaluates the final 'Earth 2.0' pitch based on the synthesis of code, data, and scientific reasoning.
- Exemplary (4 Points): Presents a professional-grade audit that synthesizes complex data analysis with a compelling habitability argument; demonstrates iterative debugging and optimization of the discovery code.
- Proficient (3 Points): Constructs a logical argument for a planet's habitability based on the code's output; final pipeline is complete and produces a valid discovery report.
- Developing (2 Points): The final report is complete, but the reasoning for habitability is weakly supported by the data or contains logical gaps.
- Beginning (1 Point): The final product is incomplete, or the habitability justification is not grounded in the data produced by the student's code.