Comet Hunter is a deterministic ingestion and processing pipeline for LASCO image streams, designed to enable reliable visualization of faint, moving sungrazing comets.
The project focuses on correctness, restartability, and explicit state modeling.
Target outcome:
A structured backend system capable of ingesting, processing, and serving time-ordered frames for scientific inspection.
Present challenges:
- Images must be processed before they are usable
- Sungrazing comets are faint and often indistinguishable in a single frame
- Chronological playback significantly improves detectability
- The citizen scientist community is large and highly active
- Most comets are reported within minutes of data availability
- Time is critical
The problem is not merely detection; it is rapid detection.
This requires automation with:
- Restartable ingestion
- Idempotent file acquisition
- State-consistent processing
- Time-indexed retrieval
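As an illustration of what idempotent file acquisition can mean in practice, here is a minimal sketch. It is not the project's actual code: `acquire_file` and the `fetch` callable are hypothetical names, and the approach (skip if present, write to a temporary file, rename atomically) is one common way to make a download safe to repeat after a crash.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable


def acquire_file(name: str, dest_dir: Path, fetch: Callable[[str], bytes]) -> bool:
    """Store `name` under `dest_dir` at most once.

    Returns True if the file was fetched, False if it was already present.
    `fetch` is a hypothetical callable that returns the file's bytes.
    """
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / name
    if target.exists():
        # Acquired on an earlier run: repeating the call is a no-op.
        return False

    data = fetch(name)
    # Write to a temporary file in the same directory, then rename it into
    # place, so an interrupted run never leaves a partial file under the
    # final name.
    fd, tmp_path = tempfile.mkstemp(dir=str(dest_dir), suffix=".part")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        os.replace(tmp_path, target)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return True
```

Calling `acquire_file` twice with the same name fetches the data only once, which is what allows the ingestion step to be restarted without special-case cleanup.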
This project approaches the problem as a reliability-focused systems design exercise.
Workflow:
- Initialize database
- Sync slots
- Sync metadata
- Trigger download
- Process files (in progress)
- Visualization of processed files (planned)
(Commands and example script coming soon...)
Pipeline stages:
- Slot Modeling
- File Metadata Ingestion
- File Discovery
- File Download
- File Processing
- Time-Indexed Retrieval
- Visualization Layer
Each stage is independently restartable and governed by explicit state transitions.
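To make "explicit state transitions" concrete, the following is a minimal sketch of an enum-based finite state machine. The state names and the `ALLOWED` table are assumptions chosen for illustration; the project's real states may differ.

```python
from enum import Enum


class FileState(Enum):
    # Hypothetical lifecycle states for a single file.
    DISCOVERED = "discovered"
    DOWNLOADING = "downloading"
    DOWNLOADED = "downloaded"
    PROCESSING = "processing"
    PROCESSED = "processed"
    FAILED = "failed"


# Every legal transition is listed explicitly; anything else is rejected.
ALLOWED = {
    FileState.DISCOVERED: {FileState.DOWNLOADING},
    FileState.DOWNLOADING: {FileState.DOWNLOADED, FileState.DISCOVERED, FileState.FAILED},
    FileState.DOWNLOADED: {FileState.PROCESSING},
    FileState.PROCESSING: {FileState.PROCESSED, FileState.DOWNLOADED, FileState.FAILED},
    FileState.PROCESSED: set(),
    FileState.FAILED: set(),
}


def transition(current: FileState, new: FileState) -> FileState:
    """Return `new` if the move is legal, otherwise raise ValueError."""
    if new not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.name} -> {new.name}")
    return new
```

In this sketch, a stage that is interrupted mid-download can move the file back from `DOWNLOADING` to `DISCOVERED` on restart, while a jump such as `DISCOVERED` to `PROCESSED` raises an error instead of silently corrupting state.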
Implemented:
- Domain entities
- Repository abstraction (SQLite)
- Deterministic schema bootstrap
- Enum-based finite state transitions
- Indexed temporal access patterns
- Metadata ingestion
- Download orchestration
In progress:
- Processing pipeline
- Image processing algorithm
Planned:
- REST retrieval API
- Interactive chronological UI
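The "deterministic schema bootstrap" and "indexed temporal access patterns" items listed under Implemented could look roughly like the sketch below. The table name, column names, and helper functions are hypothetical; only the use of SQLite comes from this README.

```python
import sqlite3
from typing import List

# Hypothetical schema: one row per file, keyed by file name, with an index
# on the observation timestamp for chronological queries.
SCHEMA = """
CREATE TABLE IF NOT EXISTS files (
    file_name   TEXT PRIMARY KEY,
    observed_at TEXT NOT NULL,  -- ISO 8601 UTC timestamp
    state       TEXT NOT NULL DEFAULT 'discovered'
);
CREATE INDEX IF NOT EXISTS idx_files_observed_at ON files (observed_at);
"""


def bootstrap(conn: sqlite3.Connection) -> None:
    """Create tables and indexes; safe to run on every start."""
    conn.executescript(SCHEMA)


def frames_between(conn: sqlite3.Connection, start: str, end: str) -> List[str]:
    """Return file names with start <= observed_at < end, oldest first."""
    rows = conn.execute(
        "SELECT file_name FROM files "
        "WHERE observed_at >= ? AND observed_at < ? "
        "ORDER BY observed_at",
        (start, end),
    ).fetchall()
    return [row[0] for row in rows]
```

Because every statement uses `IF NOT EXISTS`, running `bootstrap` repeatedly leaves the database in the same state. Storing timestamps as ISO 8601 strings in a single fixed format keeps lexicographic order equal to chronological order, so the index can serve range queries directly.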
Design principles:
- Domain-first modeling
- Idempotent pipeline semantics (using file names as primary key, retry counters, state transitions)
- Explicit state transitions (no implicit flags)
- Per-query transactional boundaries
- Strict separation of DB writes from I/O
- Deterministic initialization
- Indexed time-series access
- Failure handling (retry limits, failure states)
End goal: a fully restartable ingestion-to-visualization pipeline capable of surfacing at least one previously undetected comet.