Satellite ML – Opium Poppy Detection System
Problem
Government agencies needed a reliable way to identify illegal opium poppy cultivation across vast territories. Manual monitoring of satellite imagery was prohibitively slow and error-prone.
Context / Business Need
The project required an automated ML pipeline capable of ingesting high-resolution satellite images, detecting opium poppy fields and producing geospatial outputs for field teams. The pipeline had to be accurate and fast, and it had to scale to large areas.
Constraints
- Very large images, requiring efficient tiling and memory management.
- Class imbalance between poppy fields and other crops or terrain.
- Geospatial precision for actionable coordinates.
- Integration with government GIS systems and databases.
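To make the geospatial precision constraint concrete, here is a minimal sketch of how a pixel position maps to map coordinates using a GDAL-style six-coefficient geotransform. The function names and the example scene (0.5 m pixels, the origin coordinates) are assumptions for illustration, not values from the project.

```python
from typing import Tuple

# GDAL-style geotransform:
# (x_origin, pixel_width, row_rotation, y_origin, column_rotation, pixel_height)
GeoTransform = Tuple[float, float, float, float, float, float]


def pixel_to_geo(gt: GeoTransform, col: float, row: float) -> Tuple[float, float]:
    """Map a pixel position (col, row) to georeferenced (x, y)."""
    x = gt[0] + col * gt[1] + row * gt[2]
    y = gt[3] + col * gt[4] + row * gt[5]
    return x, y


def pixel_centre(gt: GeoTransform, col: int, row: int) -> Tuple[float, float]:
    """Coordinates of the centre of pixel (col, row), not its top-left corner."""
    return pixel_to_geo(gt, col + 0.5, row + 0.5)


# Hypothetical north-up scene: 0.5 m pixels, top-left corner at (500000, 4000000).
# Pixel height is negative because rows run from north to south.
EXAMPLE_GT: GeoTransform = (500000.0, 0.5, 0.0, 4000000.0, 0.0, -0.5)

if __name__ == "__main__":
    print(pixel_to_geo(EXAMPLE_GT, 100, 200))
    print(pixel_centre(EXAMPLE_GT, 0, 0))
```

Reporting pixel centres rather than corners avoids a systematic half-pixel offset, which matters when coordinates are handed to field teams.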
My Role
As the ML developer, I designed the end-to-end pipeline: data cleaning, preprocessing, model training, inference and geospatial post-processing. I coordinated with geospatial analysts and field agents to ensure outputs met operational requirements.
System-Thinking Approach
We treated the ML workflow as part of a larger system, from satellite image capture through preprocessing, classification, post-processing and reporting. Each stage was designed to feed the next, with clear contracts and error handling.
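One way to picture stages with clear contracts and error handling is the sketch below. It is only an illustration: `Stage`, `StageError` and `run_pipeline` are hypothetical names, and the real pipeline's interfaces may have looked quite different.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


class StageError(RuntimeError):
    """Raised when a stage fails or its output violates its contract."""

    def __init__(self, stage: str, cause: Exception):
        super().__init__(f"stage '{stage}' failed: {cause}")
        self.stage = stage
        self.cause = cause


@dataclass(frozen=True)
class Stage:
    name: str
    run: Callable[[object], object]    # transforms the payload
    check: Callable[[object], bool]    # contract on the stage's output


def run_pipeline(stages: Sequence[Stage], payload: object) -> object:
    """Run stages in order; each stage's output feeds the next."""
    for stage in stages:
        try:
            payload = stage.run(payload)
        except Exception as exc:
            # Wrap the failure so the caller knows which stage broke.
            raise StageError(stage.name, exc) from exc
        if not stage.check(payload):
            raise StageError(stage.name, ValueError("output violates contract"))
    return payload
```

Checking each output before passing it on means a malformed tile or prediction fails at the stage that produced it, rather than several stages later.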
MVP Design
The MVP implemented a binary classifier using a custom CNN trained on a labelled dataset of satellite tiles. The pipeline included data augmentation, class balancing strategies and a tiling mechanism to break large scenes into smaller patches.
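The write-up does not say which class balancing strategies were used. As one common option, the sketch below computes inverse-frequency class weights, so that the rare poppy class contributes as much to training as the abundant background class. The function names and the example label counts are assumptions.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, List


def inverse_frequency_weights(labels: Iterable[Hashable]) -> Dict[Hashable, float]:
    """Weight each class by total / (n_classes * class_count)."""
    counts = Counter(labels)
    if not counts:
        raise ValueError("labels must not be empty")
    total = sum(counts.values())
    n_classes = len(counts)
    return {cls: total / (n_classes * n) for cls, n in counts.items()}


def per_sample_weights(labels: Iterable[Hashable]) -> List[float]:
    """Expand class weights to one weight per training tile."""
    labels = list(labels)
    weights = inverse_frequency_weights(labels)
    return [weights[label] for label in labels]


if __name__ == "__main__":
    # Hypothetical tile labels: 1 = poppy, 0 = other; 10% positives.
    tile_labels = [1] * 10 + [0] * 90
    print(inverse_frequency_weights(tile_labels))
```

In a PyTorch setup, class weights like these could be passed to a weighted loss, and per-sample weights to `torch.utils.data.WeightedRandomSampler`; whether the project did either is not stated.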
Architecture Breakdown
- An ingestion module to handle large images and metadata.
- A tiling engine creating overlapping tiles with geospatial indexes.
- A preprocessing pipeline for normalisation and augmentation.
- A CNN model trained specifically for poppy field detection.
- A post-processing module to merge tile predictions, remove noise and generate shapefiles.
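The tiling and merging steps above can be sketched as follows. This is a simplified illustration under stated assumptions: the tile size and overlap defaults are invented, each tile gets a single score, and overlapping tiles are merged by plain averaging. The actual merge and noise-removal logic is not described in this write-up.

```python
from typing import List, Sequence, Tuple

Window = Tuple[int, int, int, int]  # (row_off, col_off, height, width) in pixels


def _offsets(length: int, tile: int, stride: int) -> List[int]:
    """Start offsets along one axis so that tiles cover the full length."""
    if length <= tile:
        return [0]
    offsets = list(range(0, length - tile + 1, stride))
    if offsets[-1] != length - tile:
        offsets.append(length - tile)  # final tile sits flush with the edge
    return offsets


def tile_windows(height: int, width: int, tile: int = 512, overlap: int = 64) -> List[Window]:
    """Overlapping tile windows covering a height x width scene."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must satisfy 0 <= overlap < tile")
    stride = tile - overlap
    return [
        (r, c, min(tile, height), min(tile, width))
        for r in _offsets(height, tile, stride)
        for c in _offsets(width, tile, stride)
    ]


def merge_tile_scores(
    height: int, width: int, scored: Sequence[Tuple[Window, float]]
) -> List[List[float]]:
    """Average per-tile scores over every pixel the tiles cover."""
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    for (r0, c0, h, w), score in scored:
        for r in range(r0, r0 + h):
            for c in range(c0, c0 + w):
                total[r][c] += score
                count[r][c] += 1
    return [
        [total[r][c] / count[r][c] if count[r][c] else 0.0 for c in range(width)]
        for r in range(height)
    ]
```

Each window's pixel offsets serve as its index back into the scene, so a tile's position can be converted to map coordinates with the scene's geotransform. A production version would use array operations instead of nested Python loops.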
Final Solution & Results
The final system processed 10 km × 10 km scenes in under five minutes and delivered an F1 score of 0.87. It provided precise coordinates for suspicious fields, enabling authorities to focus investigations and reduce manual workload.
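For readers less familiar with the metric, the F1 score is the harmonic mean of precision and recall. The sketch below shows the calculation. The counts in the example are made up purely for illustration and are not the project's evaluation data.

```python
def precision(tp: int, fp: int) -> float:
    """Share of predicted poppy detections that were correct."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Share of true poppy fields that were detected."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall: 2*TP / (2*TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical counts, NOT project data: 87 true positives,
    # 13 false positives, 13 false negatives.
    print(round(f1_score(tp=87, fp=13, fn=13), 2))
```

Because F1 ignores true negatives, it suits imbalanced problems like this one, where the vast majority of tiles contain no poppy fields.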
Tech Stack
- Python with PyTorch
- OpenCV and GDAL for image processing
- Custom CNN architecture for classification
- GeoPandas and QGIS for geospatial outputs
- Docker for reproducible deployments