Satellite ML – Opium Poppy Detection System

Problem

Government agencies needed a reliable way to identify illegal opium poppy cultivation across vast territories. Manual monitoring of satellite imagery was prohibitively slow and error-prone.

Context / Business Need

The project required an automated ML pipeline capable of ingesting high-resolution satellite images, detecting opium poppy fields and producing geospatial outputs for field teams. Results needed to be accurate, fast and scalable.

Constraints

  • Very large images that require efficient tiling and memory management.
  • Class imbalance between poppy fields and other crops or terrain.
  • Geospatial precision for actionable coordinates.
  • Integration with government GIS systems and databases.
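
To illustrate the geospatial precision constraint, here is a minimal sketch of how a pixel position in a scene can be mapped to map coordinates with a GDAL-style six-coefficient affine geotransform. The function names and the example geotransform (a north-up raster with 0.5 m pixels) are hypothetical and chosen only for illustration; they are not taken from the project.

```python
from typing import Sequence, Tuple


def pixel_to_world(
    geotransform: Sequence[float], col: float, row: float
) -> Tuple[float, float]:
    """Map a pixel position to map coordinates.

    `geotransform` follows the GDAL ordering:
    (origin_x, pixel_width, row_rotation, origin_y, col_rotation, pixel_height).
    For north-up images the rotation terms are 0 and pixel_height is negative.
    """
    origin_x, pixel_w, row_rot, origin_y, col_rot, pixel_h = geotransform
    x = origin_x + col * pixel_w + row * row_rot
    y = origin_y + col * col_rot + row * pixel_h
    return x, y


def pixel_center_to_world(
    geotransform: Sequence[float], col: int, row: int
) -> Tuple[float, float]:
    """Return the map coordinates of the centre of pixel (col, row)."""
    return pixel_to_world(geotransform, col + 0.5, row + 0.5)


if __name__ == "__main__":
    # Hypothetical geotransform: 0.5 m pixels, top-left corner at (500000, 3800000).
    gt = (500000.0, 0.5, 0.0, 3800000.0, 0.0, -0.5)
    print(pixel_to_world(gt, 100, 200))
    print(pixel_center_to_world(gt, 100, 200))
```

Reporting the pixel centre rather than the top-left corner avoids a systematic half-pixel offset in the coordinates handed to field teams.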

My Role

As the ML developer, I designed the end-to-end pipeline: data cleaning, preprocessing, model training, inference and geospatial post-processing. I coordinated with geospatial analysts and field agents to ensure outputs met operational requirements.

System-Thinking Approach

We treated the ML workflow as part of a larger system, from satellite image capture through preprocessing, classification, post-processing and reporting. Each stage was designed to feed the next, with clear contracts and error handling.
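
One way to picture "clear contracts and error handling" is a small runner in which each stage is a named callable and any failure is reported together with the stage that caused it. This is a sketch under stated assumptions, not the project's code: `Stage`, `StageError` and `run_pipeline` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


class StageError(RuntimeError):
    """Raised when a pipeline stage fails; records which stage it was."""

    def __init__(self, stage: str, cause: Exception):
        super().__init__(f"stage '{stage}' failed: {cause}")
        self.stage = stage


@dataclass(frozen=True)
class Stage:
    """A named processing step that takes one payload and returns the next."""

    name: str
    run: Callable[[object], object]


def run_pipeline(stages: Sequence[Stage], payload: object) -> object:
    """Feed the payload through each stage in order.

    Any exception is re-raised as StageError so that logs and reports
    identify the failing stage rather than only the low-level error.
    """
    for stage in stages:
        try:
            payload = stage.run(payload)
        except Exception as exc:
            raise StageError(stage.name, exc) from exc
    return payload


if __name__ == "__main__":
    # Toy stages standing in for tiling, classification and so on.
    stages = [Stage("double", lambda x: x * 2), Stage("increment", lambda x: x + 1)]
    print(run_pipeline(stages, 3))
```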

MVP Design

The MVP implemented a binary classifier using a custom CNN trained on a labelled dataset of satellite tiles. The pipeline included data augmentation, class balancing strategies and a tiling mechanism to break large scenes into smaller patches.
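
The text above does not say which class balancing strategy was used. As one common option, the sketch below computes inverse-frequency class weights from tile labels. The function name and the 90/10 label split are illustrative assumptions only.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable


def inverse_frequency_weights(labels: Iterable[Hashable]) -> Dict[Hashable, float]:
    """Weight each class by total / (n_classes * class_count).

    Rare classes receive weights above 1 and frequent classes below 1,
    so every class contributes equally to a weighted loss in aggregate.
    """
    counts = Counter(labels)
    if not counts:
        raise ValueError("labels must not be empty")
    total = sum(counts.values())
    n_classes = len(counts)
    return {cls: total / (n_classes * n) for cls, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical label distribution: 0 = background, 1 = poppy.
    labels = [0] * 90 + [1] * 10
    print(inverse_frequency_weights(labels))
```

Weights like these could be passed to the `weight` argument of `torch.nn.CrossEntropyLoss`, or turned into per-sample weights for `torch.utils.data.WeightedRandomSampler`, depending on whether the balancing happens in the loss or in the sampling.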

Architecture Breakdown

  • An ingestion module to handle large images and metadata.
  • A tiling engine creating overlapping tiles with geospatial indexes.
  • A preprocessing pipeline for normalisation and augmentation.
  • A CNN model trained specifically for poppy field detection.
  • A post-processing module to merge tile predictions, remove noise and generate shapefiles.
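
The tiling engine can be sketched as a generator of overlapping pixel windows. This is a minimal illustration under assumptions: the tile size of 512, the overlap of 64 and the function names are hypothetical, and the last tile on each axis is shifted back so that it stays inside the scene.

```python
from typing import Iterator, List, Tuple

Window = Tuple[int, int, int, int]  # (x_offset, y_offset, width, height)


def _starts(length: int, tile: int, stride: int) -> List[int]:
    """Start offsets along one axis, covering the full length."""
    if length <= tile:
        return [0]
    starts = list(range(0, length - tile + 1, stride))
    # Add a final tile flush with the edge if the regular grid stops short.
    if starts[-1] + tile < length:
        starts.append(length - tile)
    return starts


def tile_windows(
    width: int, height: int, tile: int = 512, overlap: int = 64
) -> Iterator[Window]:
    """Yield overlapping windows in row-major order over a width x height scene."""
    if tile <= 0 or not 0 <= overlap < tile:
        raise ValueError("require tile > 0 and 0 <= overlap < tile")
    stride = tile - overlap
    for y in _starts(height, tile, stride):
        for x in _starts(width, tile, stride):
            yield x, y, min(tile, width - x), min(tile, height - y)


if __name__ == "__main__":
    for window in tile_windows(1000, 600):
        print(window)
```

Each window's pixel offsets can be combined with the scene's geotransform to give every tile its own geospatial index, which is what lets the post-processing step place merged predictions back on the map.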

Final Solution & Results

The final system processed 10 km × 10 km scenes in under five minutes and delivered an F1 score of 0.87. It provided precise coordinates for suspicious fields, enabling authorities to focus investigations and reduce manual workload.
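
For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall. The sketch below shows the calculation from confusion-matrix counts. The counts in the example are purely illustrative and are not the project's evaluation data.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Compute precision, recall and F1 from true/false positive and false negative counts.

    Each ratio falls back to 0.0 when its denominator is zero.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Illustrative counts only, not results from the project.
    precision, recall, f1 = precision_recall_f1(tp=80, fp=20, fn=20)
    print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Because true negatives do not appear in the formula, F1 is a more informative summary than plain accuracy when poppy tiles are a small minority of the data.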

Tech Stack

  • Python with PyTorch
  • OpenCV and GDAL for image processing
  • Custom CNN architecture for classification
  • GeoPandas and QGIS for geospatial outputs
  • Docker for reproducible deployments