Satellite ML – Case Study

Problem

Government agencies needed a reliable way to identify illegal opium poppy cultivation across vast territories. Manual monitoring of satellite imagery was prohibitively slow and error-prone.

Context / Business Need

The project required an automated ML pipeline capable of ingesting high-resolution satellite images, detecting opium poppy fields and producing geospatial outputs for field teams. Results needed to be accurate, fast and scalable.

Constraints

Handling very large images with efficient tiling and memory management.
Class imbalance between poppy fields and other crops or terrain.
Geospatial precision for actionable coordinates.
Integration with government GIS systems and databases.
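
To illustrate the geospatial precision constraint, here is a minimal sketch of how a pixel position in a scene maps to map coordinates. It assumes a GDAL-style six-coefficient affine geotransform (the tuple that GDAL's GetGeoTransform() returns for a dataset). The function names and the example scene values are hypothetical and are not taken from the project.

```python
from typing import Tuple

# GDAL-style affine geotransform:
# (origin_x, pixel_width, row_rotation, origin_y, column_rotation, pixel_height)
GeoTransform = Tuple[float, float, float, float, float, float]


def pixel_to_world(gt: GeoTransform, col: float, row: float) -> Tuple[float, float]:
    """Map a pixel position (col, row) to map coordinates (x, y)."""
    x = gt[0] + col * gt[1] + row * gt[2]
    y = gt[3] + col * gt[4] + row * gt[5]
    return x, y


def pixel_center_to_world(gt: GeoTransform, col: int, row: int) -> Tuple[float, float]:
    """Map the centre of pixel (col, row), not its top-left corner."""
    return pixel_to_world(gt, col + 0.5, row + 0.5)


# Hypothetical north-up scene: 0.5 m pixels, top-left corner at (500000, 3800000).
# pixel_height is negative because row numbers grow downwards while y grows north.
example_gt: GeoTransform = (500000.0, 0.5, 0.0, 3800000.0, 0.0, -0.5)

corner = pixel_to_world(example_gt, 0, 0)
center = pixel_center_to_world(example_gt, 10, 20)
```

Using the pixel centre rather than the corner avoids a systematic half-pixel offset in reported coordinates, which matters when field teams navigate to a detection.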

My Role

As the ML developer, I designed the end-to-end pipeline: data cleaning, preprocessing, model training, inference and geospatial post-processing. I coordinated with geospatial analysts and field agents to ensure outputs met operational requirements.

System-Thinking Approach

We treated the ML workflow as part of a larger system, from satellite image capture through preprocessing, classification, post-processing and reporting. Each stage was designed to feed the next, with clear contracts and error handling.
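
The idea of stages with clear contracts and error handling can be sketched roughly as follows. This is an illustration under assumptions, not the project's code: the Stage and StageError types, the run_pipeline helper and the toy stages are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


class StageError(RuntimeError):
    """Raised when a stage fails; records which stage broke and why."""

    def __init__(self, stage: str, cause: Exception):
        super().__init__(f"stage '{stage}' failed: {cause}")
        self.stage = stage
        self.cause = cause


@dataclass(frozen=True)
class Stage:
    name: str
    run: Callable[[Any], Any]      # transforms the payload
    check: Callable[[Any], bool]   # contract the stage output must satisfy


def run_pipeline(stages: List[Stage], payload: Any) -> Any:
    """Run stages in order, validating each output before passing it on."""
    for stage in stages:
        try:
            payload = stage.run(payload)
        except Exception as exc:
            raise StageError(stage.name, exc) from exc
        if not stage.check(payload):
            raise StageError(stage.name, ValueError("output violated contract"))
    return payload


# Toy stand-ins for the real stages (hypothetical logic, illustration only).
demo_stages = [
    Stage("tiling", lambda scene: [scene] * 4, lambda tiles: len(tiles) > 0),
    Stage(
        "classification",
        lambda tiles: [0.9 for _ in tiles],
        lambda scores: all(0.0 <= s <= 1.0 for s in scores),
    ),
]
scores = run_pipeline(demo_stages, "scene-001")
```

Validating each stage's output at the boundary means a failure is reported against the stage that caused it, instead of surfacing later in an unrelated step.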

MVP Design

The MVP implemented a binary classifier using a custom CNN trained on a labelled dataset of satellite tiles. The pipeline included data augmentation, class balancing strategies and a tiling mechanism to break large scenes into smaller patches.
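
The text does not say which class balancing strategies were used. One common option is inverse-frequency class weighting, sketched below with a hypothetical helper and made-up label counts. Weights like these could then be passed to a weighted loss, for example the weight argument of PyTorch's CrossEntropyLoss.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable


def inverse_frequency_weights(labels: Iterable[Hashable]) -> Dict[Hashable, float]:
    """Weight each class by total / (n_classes * class_count).

    Rare classes get weights above 1, common classes below 1, so each class
    contributes roughly equally to a weighted loss.
    """
    counts = Counter(labels)
    if not counts:
        raise ValueError("labels must not be empty")
    total = sum(counts.values())
    n_classes = len(counts)
    return {cls: total / (n_classes * n) for cls, n in counts.items()}


# Made-up tile labels: 90 background tiles (0) and 10 poppy tiles (1).
example_labels = [0] * 90 + [1] * 10
weights = inverse_frequency_weights(example_labels)
```

With this made-up 90/10 split, the minority class receives nine times the weight of the majority class.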

Architecture Breakdown

An ingestion module to handle large images and metadata.
A tiling engine creating overlapping tiles with geospatial indexes.
A preprocessing pipeline for normalisation and augmentation.
A CNN model trained specifically for poppy field detection.
A post-processing module to merge tile predictions, remove noise and generate shapefiles.
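
As a rough illustration of the tiling engine, the sketch below generates overlapping pixel windows that cover a scene completely. The tile size, the overlap and the function names are assumptions chosen for the example. The real engine also attached geospatial indexes to each tile, which this sketch omits.

```python
from typing import Iterator, List, Tuple

Window = Tuple[int, int, int, int]  # (col_offset, row_offset, width, height)


def _offsets(length: int, tile: int, stride: int) -> List[int]:
    """Start offsets along one axis so that tiles cover [0, length)."""
    if length <= tile:
        return [0]
    offsets = list(range(0, length - tile + 1, stride))
    if offsets[-1] != length - tile:
        # Add a final tile flush with the far edge so no pixels are skipped.
        offsets.append(length - tile)
    return offsets


def tile_windows(width: int, height: int, tile: int = 512, overlap: int = 64) -> Iterator[Window]:
    """Yield overlapping windows in row-major order."""
    if tile <= 0 or not 0 <= overlap < tile:
        raise ValueError("need tile > 0 and 0 <= overlap < tile")
    stride = tile - overlap
    for row in _offsets(height, tile, stride):
        for col in _offsets(width, tile, stride):
            # Scenes smaller than a tile produce one clipped window.
            yield col, row, min(tile, width), min(tile, height)


windows = list(tile_windows(1000, 600, tile=512, overlap=64))
```

Overlap means a field that straddles a tile boundary appears whole in at least one tile, provided it is smaller than the overlap. The cost is that the post-processing step has to merge duplicate detections from neighbouring tiles.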

Final Solution & Results

The final system processed 10 km × 10 km scenes in under five minutes and delivered an F1 score of 0.87. It provided precise coordinates for suspicious fields, enabling authorities to focus investigations and reduce manual workload.
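
For readers unfamiliar with the metric, F1 is the harmonic mean of precision and recall, which makes it more informative than plain accuracy when one class is rare. The sketch below computes it from confusion counts. The counts are invented for illustration and are not the project's evaluation data.

```python
def precision(tp: int, fp: int) -> float:
    """Share of predicted poppy tiles that really are poppy."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Share of real poppy tiles that the model found."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall: 2*TP / (2*TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Invented counts: 80 true positives, 20 false positives, 10 false negatives.
p = precision(80, 20)
r = recall(80, 10)
f1 = f1_score(80, 20, 10)
```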

Tech Stack

Python with PyTorch
OpenCV and GDAL for image processing
Custom CNN architecture for classification
GeoPandas and QGIS for geospatial outputs
Docker for reproducible deployments
