Astro-Python: Exoplanet Discovery through Light Curve Analysis
Created by Abdullah Abdullah

College/University, Science, 15 days
In this university-level project, students architect a Python-based diagnostic tool to identify exoplanets by analyzing stellar light curves. Learners progress from mastering foundational programming syntax to developing a complex data pipeline that distinguishes true planetary transits from stellar noise through mathematical modeling and iterative debugging. The experience culminates in the creation of a professional-grade discovery portfolio where students visualize astronomical data and calculate the habitability of distant solar systems.
Python Programming, Exoplanet Detection, Data Science, Light Curve Analysis, Computational Modeling, Signal Processing, Habitability Audit

Inquiry Framework

Question Framework

Driving Question

The overarching question that guides the entire project.
How can we architect a Python-based diagnostic tool to decode the light signatures of distant stars, distinguishing true planetary transits from stellar noise through mathematical modeling and iterative data analysis?

Essential Questions

Supporting questions that break down major concepts.
  • How can we translate the physical phenomenon of a planetary transit into a mathematical model that a computer can process?
  • What are the foundational building blocks of Python (variables, loops, and conditionals) required to parse and analyze large astronomical datasets?
  • How do we distinguish between 'noise' (stellar activity) and 'signal' (a true exoplanet) through algorithmic logic?
  • In what ways does data visualization (using libraries like Matplotlib) allow us to identify patterns that are invisible in raw numerical data?
  • How does the iterative process of debugging code mirror the scientific method when testing hypotheses about celestial bodies?

Standards & Learning Goals

Learning Goals

By the end of this project, students will be able to:
  • Master foundational Python programming syntax, including variables, data types, loops, and conditional logic, to manipulate astronomical datasets.
  • Develop a data processing pipeline that reads and cleans raw stellar flux data to prepare it for algorithmic analysis.
  • Construct mathematical models of planetary transits and implement them in code to differentiate between periodic dips and random stellar noise.
  • Utilize the Matplotlib library to create clear, informative visualizations of light curves that highlight detected transit events.
  • Apply computational thinking and iterative debugging strategies to refine the accuracy of the exoplanet detection algorithm.

CSTA K-12 Computer Science Standards

CSTA 3B-AP-14
Primary
Construct solutions to problems using student-created components, such as procedures, modules and/or objects. (Level 3B: Computer Science Concepts and Practices)
Reason: The project requires students to build a diagnostic tool from scratch using Python, moving from basic syntax to a functional analysis module.
CSTA 3B-DA-06
Secondary
Create and use a computational model or simulation of a phenomenon, designed device, process, or system.
Reason: The project involves translating the physical phenomenon of a planetary transit into a mathematical/computational model.

ACM/IEEE Computer Science Curricula

ACM/IEEE CS1.1
Primary
Fundamental Programming Concepts: Students will apply variables, expressions, loops, and conditionals to solve computational problems.
Reason: Since the project teaches coding from scratch at the university level, these core programming fundamentals are the primary academic focus.

Next Generation Science Standards (NGSS)

NGSS SEP-4
Secondary
Analyze data using tools, technologies, and/or models (e.g., computational, mathematical) in order to make valid and reliable scientific claims or determine an optimal design solution.
Reason: Students are using Python as a computational tool to analyze light curves and make scientific claims about the existence of exoplanets.

Common Core State Standards for Mathematical Practice

CCSS.MATH.PRACTICE.MP4
Supporting
Model with mathematics. Mathematically proficient students can apply the mathematics they know to solve problems arising in everyday life, society, and the workplace.
Reason: The project uses mathematical modeling to represent the flux of a star and the geometry of a transit to detect signal patterns.

Entry Events

Events that will be used to introduce the project to students

The 'Earth 2.0' Recruitment Drive

The event begins with a debate on the Fermi Paradox ('Where is everybody?'), followed by a demonstration of the sheer volume of unanalyzed space data. Students are told they will no longer be spectators of science but will write the foundational code necessary to join the global hunt for Earth 2.0, turning their laptops into professional-grade observatories.

Interstellar Real Estate: The Habitability Audit

Students act as lead data scientists for an interstellar colonization firm that needs to verify 'Real Estate' candidates. They must develop Python scripts to calculate the size, orbital period, and temperature of potential planets, ultimately pitching which 'world' justifies a multi-billion dollar exploration mission based on their data.

Portfolio Activities

These activities progressively build towards your learning goals, with each submission contributing to the student's final portfolio.
Activity 1

Mission Briefing: The Stellar Ledger

In this introductory activity, students set up their Python environment (such as Jupyter Notebooks or VS Code) and learn to store astronomical data using variables. They will calculate a star's basic properties, such as luminosity and habitable zone boundaries, using Python as a high-powered calculator. This establishes the foundation for handling the numerical data found in light curves.

Steps

Here is some basic scaffolding to help students complete the activity.
1. Set up the Python environment and verify the installation by printing a 'System Check: Ready for Deep Space Analysis' message.
2. Define variables using appropriate data types (Strings for star names, Floats for mass and temperature) based on a provided 'Star Catalog' sheet.
3. Implement mathematical expressions to calculate Stellar Luminosity using the Stefan-Boltzmann law (simplified for relative solar units).
4. Use formatted string literals (f-strings) to print a 'Stellar Profile Report' summarizing the star's data for the 'Interstellar Real Estate' firm.
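
The four steps above can be sketched as a single short script. This is a minimal illustration rather than a reference solution: the star's name and values are invented stand-ins for a 'Star Catalog' entry, and the habitable zone bounds (0.95 and 1.67 AU for a Sun-like star, scaled by the square root of luminosity) are an assumed rule of thumb that the project itself does not specify.

```python
import math

# Step 1: environment check
print("System Check: Ready for Deep Space Analysis")

# Step 2: hypothetical catalog entry (values invented for illustration)
star_name = "Demo Star A"     # string
star_mass = 0.90              # float, solar masses
star_radius = 0.85            # float, solar radii
star_temperature = 5200.0     # float, kelvin

SOLAR_TEMPERATURE = 5772.0    # kelvin, nominal solar effective temperature

# Step 3: Stefan-Boltzmann law in relative solar units:
# L / L_sun = (R / R_sun)^2 * (T / T_sun)^4
luminosity = star_radius**2 * (star_temperature / SOLAR_TEMPERATURE) ** 4

# Assumed habitable zone bounds for a Sun-like star, scaled by sqrt(L)
HZ_INNER_AU_SUN = 0.95
HZ_OUTER_AU_SUN = 1.67
hz_inner = HZ_INNER_AU_SUN * math.sqrt(luminosity)
hz_outer = HZ_OUTER_AU_SUN * math.sqrt(luminosity)

# Step 4: f-string report
print("--- Stellar Profile Report ---")
print(f"Star:           {star_name}")
print(f"Mass:           {star_mass:.2f} solar masses")
print(f"Radius:         {star_radius:.2f} solar radii")
print(f"Temperature:    {star_temperature:.0f} K")
print(f"Luminosity:     {luminosity:.3f} solar luminosities")
print(f"Habitable zone: {hz_inner:.2f} to {hz_outer:.2f} AU")
```

Working in solar units keeps the physical constants out of the script, so the arithmetic stays readable for students who are still learning expressions and operator precedence.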

Final Product

What students will submit as the final product of the activity:
A Python script ('stellar_profile.py') that defines a star's characteristics (name, mass, radius, temperature) and outputs its calculated luminosity and habitable zone distances.

Alignment

How this activity aligns with the learning objectives & standards:
Primary Alignment: ACM/IEEE CS1.1 (Variables and Expressions). This activity introduces the basic syntax needed to store and manipulate scientific data, moving from conceptual stellar properties to computational variables.
Activity 2

The Flux Filter: Logic in the Light

Students move from static variables to dynamic data processing. They will learn to use 'for' loops to iterate through a list of stellar flux (brightness) values and 'if/else' statements to flag moments where the brightness drops, signifying a potential planetary transit. This simulates the automated scanning of thousands of data points.

Steps

Here is some basic scaffolding to help students complete the activity.
1. Create a Python List containing a sequence of flux values (simulating a star's brightness over time).
2. Write a 'for' loop to traverse the list, accessing each individual flux measurement.
3. Develop a conditional 'if' statement that compares the current flux against a 'threshold value' (e.g., if flux < 0.99).
4. Implement a counter variable to track how many 'dips' were found and print a summary of the findings.
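
A minimal sketch of the loop-and-threshold logic described in these steps follows. The flux values are made up and deliberately few so the result can be checked by hand. The 0.99 threshold comes from the example in step 3, while the names `flux_values`, `dip_count`, and `dip_indices` are illustrative choices.

```python
# Simulated relative flux values (invented for illustration, not real observations)
flux_values = [1.000, 1.001, 0.999, 0.985, 0.984, 0.986, 1.000, 1.002, 0.998, 1.001]

THRESHOLD = 0.99  # flux below this value is flagged as a possible dip

dip_count = 0     # counter for flagged measurements
dip_indices = []  # positions in the list where a dip was seen

# enumerate() yields both the position and the value of each measurement
for index, flux in enumerate(flux_values):
    if flux < THRESHOLD:
        dip_count += 1
        dip_indices.append(index)

print(f"Scanned {len(flux_values)} measurements.")
print(f"Dips found: {dip_count} at indices {dip_indices}")
```

With this list the script flags three consecutive measurements (indices 3, 4, and 5). A fixed threshold is the simplest possible detector: it will also flag isolated noisy points, which is a useful starting point for discussing false positives.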

Final Product

What students will submit as the final product of the activity:
A 'Transit Detector' script that processes a list of 100+ flux values and prints the specific timestamps where a potential planet was detected.

Alignment

How this activity aligns with the learning objectives & standards:
Primary Alignment: ACM/IEEE CS1.1 (Loops and Conditionals) and CSTA 3B-AP-14 (Constructing solutions). Students use control flow to automate the inspection of data points, a core programming skill.
Activity 3

The Data Cartographer: Visualizing Distant Suns

Raw numbers can be misleading. In this activity, students learn to use the Matplotlib library to transform their numerical lists into visual light curves. They will learn to customize plots to distinguish between 'noise' (random fluctuations) and the distinct 'U-shape' of a planetary transit, a critical skill for any data scientist.

Steps

Here is some basic scaffolding to help students complete the activity.
1. Import the Matplotlib library and prepare 'Time' (x-axis) and 'Flux' (y-axis) data arrays.
2. Generate a scatter plot and a line plot of the stellar data to observe the difference in data clarity.
3. Customize the visualization with labels (Time in Days, Relative Flux), gridlines, and a specific color palette for readability.
4. Identify a 'transit dip' visually and use code to draw a vertical span or arrow highlighting the discovery on the plot.
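
One way these plotting steps might look in practice is sketched below. The light curve is synthetic: a small sine wiggle stands in for stellar noise, and a box-shaped dip is placed at an arbitrary position, so its shape is flatter than a real transit. The output file name `light_curve.png` is also just an example.

```python
import math

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt

# Step 1: synthetic time and flux arrays (100 samples, 0.1 day apart)
time_days = [i * 0.1 for i in range(100)]
flux = []
for i, t in enumerate(time_days):
    value = 1.0 + 0.001 * math.sin(7.0 * t)  # small wiggle standing in for noise
    if 40 <= i <= 46:                        # samples inside the invented transit
        value -= 0.012
    flux.append(value)

fig, ax = plt.subplots(figsize=(8, 4))

# Step 2: draw the same data as a line and as scatter points for comparison
ax.plot(time_days, flux, color="tab:blue", linewidth=1, label="Line plot")
ax.scatter(time_days, flux, color="tab:orange", s=8, label="Scatter plot")

# Step 3: labels, title, and gridlines
ax.set_xlabel("Time (days)")
ax.set_ylabel("Relative flux")
ax.set_title("Synthetic light curve with one transit-like dip")
ax.grid(True)

# Step 4: shade the time window that contains the dip
ax.axvspan(time_days[40], time_days[46], color="red", alpha=0.2,
           label="Candidate transit")

ax.legend()
fig.savefig("light_curve.png", dpi=150)
```

In a Jupyter notebook the `matplotlib.use("Agg")` line can be dropped and the figure will display inline.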

Final Product

What students will submit as the final product of the activity:
A professionally formatted Light Curve Plot (PNG or PDF) with labeled axes, a title, and annotated transit events.

Alignment

How this activity aligns with the learning objectives & standards:
Primary Alignment: NGSS SEP-4 (Analyzing and interpreting data) and CSTA 3B-DA-06 (Computational modeling). This aligns with the visual nature of astronomical discovery, where patterns are identified through graphical analysis.
Activity 4

The Transit Architect: Building the Diagnostic Tool

Students will now create Python functions to model the geometry of a transit. They will write code that takes the 'depth' of a light curve dip and calculates the radius of the exoplanet relative to its host star. This activity teaches students how to encapsulate complex logic into reusable 'tools' that can be applied to any star system.

Steps

Here is some basic scaffolding to help students complete the activity.
1. Define a Python function that accepts 'transit depth' and 'stellar radius' as arguments.
2. Incorporate the mathematical formula (Radius_p = Radius_s * sqrt(depth)) into the function's return statement.
3. Test the function with different 'test cases' to ensure it handles various star types correctly (debugging).
4. Add 'docstrings' to the function to explain what the calculation does, following professional coding documentation standards.
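
The steps above might translate into a small module like the following sketch. `calculate_planet_radius()` implements the formula from step 2. The body of `estimate_orbital_period()` is an assumption: the project names the function but not its method, so this version simply averages the spacing between transit mid-times. The input checks and test values are illustrative.

```python
import math


def calculate_planet_radius(transit_depth, stellar_radius):
    """Return the planet radius in the same units as stellar_radius.

    Uses Radius_p = Radius_s * sqrt(depth), where depth is the fractional
    drop in flux during transit (for example, 0.01 for a 1% dip).
    """
    if not 0 <= transit_depth < 1:
        raise ValueError("transit_depth must be a fraction between 0 and 1")
    if stellar_radius <= 0:
        raise ValueError("stellar_radius must be positive")
    return stellar_radius * math.sqrt(transit_depth)


def estimate_orbital_period(transit_times):
    """Return the mean spacing between consecutive transit mid-times.

    Assumed approach: every listed time belongs to the same planet and
    no transit was missed between them.
    """
    if len(transit_times) < 2:
        raise ValueError("need at least two transits to estimate a period")
    gaps = [later - earlier
            for earlier, later in zip(transit_times, transit_times[1:])]
    return sum(gaps) / len(gaps)


# Step 3: simple test cases with hand-checkable answers
assert math.isclose(calculate_planet_radius(0.01, 1.0), 0.1)
assert math.isclose(calculate_planet_radius(0.0001, 0.5), 0.005)
assert math.isclose(estimate_orbital_period([2.0, 5.5, 9.0]), 3.5)
print("All test cases passed.")
```

Raising `ValueError` for impossible inputs gives students a concrete example of defensive coding, which the rubric's 'robust error handling' criterion rewards.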

Final Product

What students will submit as the final product of the activity:
A custom Python module containing functions like 'calculate_planet_radius()' and 'estimate_orbital_period()' to be used in the final project pipeline.

Alignment

How this activity aligns with the learning objectives & standards:
Primary Alignment: CCSS.MATH.PRACTICE.MP4 (Modeling with mathematics) and CSTA 3B-AP-14 (Procedures/Modules). This focuses on the transition from simple scripts to reusable, functional code.
Activity 5

The Earth 2.0 Pipeline: From Code to Discovery

In the final portfolio activity, students integrate their previous scripts into a complete Data Processing Pipeline. They will read a CSV dataset (simulated Kepler data), clean the data by removing 'NaN' (Not a Number) values, run their transit-detection logic, and generate a final 'Habitability Audit' report for their chosen planet candidate.

Steps

Here is some basic scaffolding to help students complete the activity.
1. Use Python's 'csv' or 'pandas' module to load a large astronomical dataset from an external file.
2. Implement a data cleaning step to handle missing values or extreme stellar noise using conditional filters.
3. Apply the 'Transit Architect' functions to the cleaned data to derive the physical properties of the detected planet.
4. Compare the results against 'Earth-like' parameters to determine if the planet falls within the habitable zone.
5. Perform a final 'code review' to optimize the algorithm's efficiency before submitting the 'Earth 2.0' pitch.
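
A compact sketch of how these pipeline steps could fit together is shown below, using only the standard library. Everything concrete in it is an assumption made for illustration: the CSV columns (`time`, `flux`), the inline sample data, the outlier limits, the host-star values, the orbital period, and the habitable zone bounds. Orbital distance is derived from Kepler's third law in solar units, which the steps do not name explicitly.

```python
import csv
import io
import math

# Stand-in for a CSV file on disk. Column names and values are invented.
RAW_CSV = """time,flux
0.0,1.000
0.5,1.001
1.0,nan
1.5,0.990
2.0,0.990
2.5,
3.0,1.000
3.5,5.000
4.0,0.999
"""

# Assumed host-star and orbit values, chosen only for the demo
STELLAR_RADIUS = 1.0          # solar radii
STELLAR_MASS = 1.0            # solar masses
STELLAR_LUMINOSITY = 1.0      # solar luminosities
ORBITAL_PERIOD_DAYS = 365.25  # days


def load_light_curve(handle):
    """Read (time, flux) rows, skipping rows with missing or NaN flux."""
    rows = []
    for record in csv.DictReader(handle):
        try:
            time = float(record["time"])
            flux = float(record["flux"])
        except (TypeError, ValueError):
            continue  # blank or non-numeric field
        if math.isnan(flux):
            continue
        rows.append((time, flux))
    return rows


def remove_outliers(rows, low=0.9, high=1.1):
    """Keep rows whose flux lies inside an assumed plausible range."""
    return [(t, f) for t, f in rows if low <= f <= high]


# Steps 1 and 2: load and clean
rows = remove_outliers(load_light_curve(io.StringIO(RAW_CSV)))

# Step 3: transit depth relative to a normalized baseline of 1.0
depth = 1.0 - min(flux for _, flux in rows)
planet_radius = STELLAR_RADIUS * math.sqrt(depth)  # solar radii

# Step 4: Kepler's third law in solar units, a^3 = M * P^2 (AU, solar masses, years)
period_years = ORBITAL_PERIOD_DAYS / 365.25
orbit_au = (STELLAR_MASS * period_years**2) ** (1 / 3)

# Assumed habitable zone bounds for a Sun-like star, scaled by sqrt(L)
hz_inner = 0.95 * math.sqrt(STELLAR_LUMINOSITY)
hz_outer = 1.67 * math.sqrt(STELLAR_LUMINOSITY)
in_habitable_zone = hz_inner <= orbit_au <= hz_outer

print("--- Habitability Audit ---")
print(f"Clean data points: {len(rows)}")
print(f"Transit depth:     {depth:.4f}")
print(f"Planet radius:     {planet_radius:.3f} solar radii")
print(f"Orbital distance:  {orbit_au:.2f} AU")
print(f"In habitable zone: {in_habitable_zone}")
```

To read an actual file, `io.StringIO(RAW_CSV)` can be replaced with a handle from `open("kepler_sample.csv", newline="")`, where the file name is a placeholder. The same cleaning could be done with pandas, but the standard-library version keeps each filtering decision visible as an explicit conditional.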

Final Product

What students will submit as the final product of the activity:
A comprehensive 'Earth 2.0 Discovery Portfolio' consisting of the final Python pipeline code, a visualized light curve, and a written justification of the planet's potential habitability.

Alignment

How this activity aligns with the learning objectives & standards:
Primary Alignment: CSTA 3B-AP-14 (Constructing solutions) and NGSS SEP-4 (Scientific claims from models). This represents the culmination of all previous skills into a unified scientific workflow.

Rubric & Reflection

Portfolio Rubric

Grading criteria for assessing the overall project portfolio

Astro-Python: Exoplanet Discovery Rubric

Category 1

Programming Fundamentals

Core computer science concepts including syntax, control flow, and algorithmic implementation.
Criterion 1

Pythonic Foundations & Logic

Evaluates the student's ability to use variables, data types, loops, and conditional statements to process astronomical data.

Exemplary
4 Points

Demonstrates masterly use of Python syntax; control flow (loops/conditionals) is optimized for performance; variables are named using professional naming conventions (snake_case); utilizes advanced features like list comprehensions or nested logic with total accuracy.

Proficient
3 Points

Correctly implements variables, for-loops to iterate through flux data, and conditional statements to flag transit events; code is functional and largely follows standard style guidelines.

Developing
2 Points

Basic loops and conditionals are present but may contain logic errors (e.g., off-by-one errors) or inconsistent variable naming; code requires some manual intervention to run correctly.

Beginning
1 Point

Syntax errors prevent code from executing; demonstrates significant misunderstanding of how to use loops or if-statements to filter data.

Category 2

Scientific Modeling

The application of physics and mathematics through computational structures.
Criterion 1

Mathematical Modeling & Modular Design

Assesses the ability to translate the physical phenomenon of a planetary transit into a functional Python module.

Exemplary
4 Points

Functions are expertly designed with clear input/output parameters, comprehensive docstrings, and robust error handling; mathematical models (like transit depth to radius) are implemented with high precision.

Proficient
3 Points

Successfully encapsulates mathematical formulas into reusable functions; includes basic documentation (comments) and returns accurate calculations for planetary properties.

Developing
2 Points

Mathematical formulas are used in the script but not encapsulated in reusable functions; some calculations of luminosity or radius contain inaccuracies.

Beginning
1 Point

Formula implementation is incorrect or missing; the student is unable to translate the physics of the project into code-based logic.

Category 3

Data Interpretation & Visualization

Transforming raw data into meaningful graphical representations for analysis.
Criterion 1

Algorithmic Visualization

Evaluates the effectiveness of data visualizations in identifying patterns and distinguishing signal from noise.

Exemplary
4 Points

Creates sophisticated, multi-layered visualizations; uses advanced Matplotlib features (e.g., subplots, annotations for transit events, customized style sheets) to make a compelling scientific case for a discovery.

Proficient
3 Points

Produces clear, well-labeled light curves with appropriate axes and titles; visualizes the 'U-shape' of a transit effectively to support data claims.

Developing
2 Points

Generates basic plots but lacks essential labeling (units, axes names); visualizations are cluttered or do not clearly highlight the transit event.

Beginning
1 Point

Visualizations are missing, incorrect, or fail to represent the relationship between time and flux.

Category 4

Engineering & Analysis

Managing the lifecycle of data from ingestion to final reporting.
Criterion 1

Data Cleaning & Pipeline Integration

Assesses the ability to handle real-world scientific data, including loading CSV files and cleaning noise.

Exemplary
4 Points

Implements advanced data cleaning strategies to handle outliers and missing 'NaN' values; uses Pandas or specialized modules to efficiently manage large-scale astronomical datasets.

Proficient
3 Points

Successfully imports data from external files; applies filters to remove obvious stellar noise and prepares the dataset for the analysis pipeline.

Developing
2 Points

Attempts to load external data but struggles with file paths or data formats; data cleaning is minimal, leading to 'noisy' results in the detection phase.

Beginning
1 Point

Unable to load or process data from external sources; pipeline fails at the ingestion stage.

Category 5

Final Project Synthesis

The culmination of technical skills into a coherent scientific finding.
Criterion 1

Scientific Discovery & Synthesis

Evaluates the final 'Earth 2.0' pitch based on the synthesis of code, data, and scientific reasoning.

Exemplary
4 Points

Presents a professional-grade audit that synthesizes complex data analysis with a compelling habitability argument; demonstrates iterative debugging and optimization of the discovery code.

Proficient
3 Points

Constructs a logical argument for a planet's habitability based on the code's output; final pipeline is complete and produces a valid discovery report.

Developing
2 Points

The final report is complete but the reasoning for habitability is weakly supported by the data or contains logical gaps.

Beginning
1 Point

The final product is incomplete or the habitability justification is not grounded in the data produced by the student's code.

Reflection Prompts

End-of-project reflection questions to get students to think about their learning
Question 1

How did the process of 'debugging' your exoplanet detection algorithm change your understanding of the scientific method?

Text
Required
Question 2

Rate your level of confidence in your ability to translate a complex physical phenomenon into a functional computational model.

Scale
Required
Question 3

Which component of the project most shifted your perspective on how 'Big Data' is used in modern astronomy?

Multiple choice
Required
Options
Syntax & Structure: Mastering Python variables, loops, and functions.
Data Literacy: Cleaning raw CSV data and handling 'NaN' values or noise.
Mathematical Modeling: Applying laws like Stefan-Boltzmann to code logic.
Visual Analysis: Using Matplotlib to see patterns invisible in raw numbers.
Question 4

In the 'Earth 2.0' recruitment scenario, what are the risks of a 'False Positive' versus a 'False Negative' in your detection script, and how did you balance this in your code?

Text
Optional
Question 5

How do you plan to apply the computational thinking and data pipeline strategies developed in this project to your future coursework or professional career?

Text
Required