Why Fine-tuning Beats Prompting for Satellite AI
Everyone reaches for prompt engineering first. For general language tasks, that's often enough. For satellite imagery, it almost never is — and here's why the gap exists.
Most satellite AI projects stall between research and reality — models that work in notebooks but never reach production. I focus on closing that gap: taking a 300M-parameter foundation model, adapting it to a real earth observation task, and delivering 4x the baseline performance with docs, inference code, and a pipeline anyone can reproduce.
I work across the full ML stack — satellite vision, Gen AI, RAG pipelines, agentic workflows — because most hard problems don't fit neatly into one discipline. The goal is always a working system, not just a trained model.
At Godel Space I built production satellite computer vision pipelines for disaster monitoring. At Unify I solved LLM provider fragmentation with a modular integration framework. I won 1st place at an edge AI hackathon by deploying an optimized model that ran at 4x the speed while keeping 85% of the full model's accuracy.

Each of these maps to a real challenge I've worked through — from raw data to a running system.
Pre-trained models don't understand satellite data out of the box. I fine-tune foundation models (Prithvi, AnySat) on domain-specific imagery to make them actually useful for detection, segmentation, and monitoring.
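The usual adaptation pattern is to freeze the pretrained encoder and train only a lightweight task head on its features. A toy sketch of that idea — the "features" and labels below are synthetic stand-ins, not Prithvi or AnySat outputs:

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretend these are frozen-encoder features for 200 pixels (dim 16),
# with binary labels (e.g. water / not-water). Purely synthetic.
X = rng.normal(size=(200, 16)).astype(np.float32)
y = (X[:, 0] + 0.1 * rng.normal(size=200) > 0).astype(np.float32)

# Train only a tiny logistic-regression head; the "backbone" stays frozen.
w, b = np.zeros(16, dtype=np.float32), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid
    g = p - y                               # gradient of BCE loss w.r.t. logits
    w -= 0.1 * (X.T @ g) / len(y)
    b -= 0.1 * g.mean()

acc = ((p > 0.5) == (y > 0.5)).mean()
```

In real fine-tuning the head is a segmentation decoder and the optimizer is fancier, but the division of labor — frozen domain-general features, trainable task-specific head — is the same.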
Most ML work stops at the notebook. I build the full loop — data collection, preprocessing, training, evaluation, and deployment — so the model actually runs in production.
Satellite data is noisy, multi-spectral, and massive. I handle acquisition and processing at scale — multi-sensor fusion, temporal stacking, spectral analysis — so models have clean inputs.
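Two of those preprocessing steps — temporal stacking and a spectral index — can be sketched with synthetic arrays. Band order and index choices below are placeholders, not the actual pipeline:

```python
import numpy as np

# Hypothetical inputs: two Sentinel-2 acquisitions, 4 bands
# (B02, B03, B04, B08), 64x64 pixels. Random data stands in for reflectance.
t0 = np.random.rand(4, 64, 64).astype(np.float32)
t1 = np.random.rand(4, 64, 64).astype(np.float32)

# Temporal stacking: concatenate acquisitions along the channel axis.
stack = np.concatenate([t0, t1], axis=0)  # shape (8, 64, 64)

def ndwi(bands: np.ndarray, green: int = 1, nir: int = 3) -> np.ndarray:
    """Normalized Difference Water Index: (green - NIR) / (green + NIR)."""
    g, n = bands[green], bands[nir]
    return (g - n) / np.clip(g + n, 1e-6, None)  # clip avoids divide-by-zero

water_mask = ndwi(t1) > 0.0  # crude water threshold; tuned per scene in practice
```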
A model no one else can run solves nothing. I ship with proper docs, inference APIs, containerized environments, and model versioning so teams can actually use and maintain what I build.
Off-the-shelf LLMs hallucinate on domain data. I solve that through fine-tuning, RAG pipelines, and agentic workflows — building products that are reliable in the real use case, not just demos.
AI systems need reliable APIs and data infrastructure around them. I build the backend layer — APIs, data processing, databases — that holds the whole thing together.
Each project started with a specific gap or failure mode. Here's what the problem was and what came out of it.
Flood mapping from satellite imagery is critical for disaster response, but pre-trained models perform poorly out of the box. Fine-tuned Prithvi EO-2.0 on Sen1Floods11 to produce accurate, production-ready flood segmentation.
Wildfire burn scar mapping requires understanding temporal change, not just a snapshot. Built an end-to-end pipeline with a novel Delta Channel Algorithm to explicitly encode pre/post-fire spectral differences, significantly improving classification accuracy.
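The general idea behind explicit change channels can be sketched like this — the band indices, shapes, and the dNBR choice are illustrative assumptions, not the actual Delta Channel Algorithm:

```python
import numpy as np

def add_delta_channels(pre: np.ndarray, post: np.ndarray,
                       nir: int = 3, swir: int = 5) -> np.ndarray:
    """Append explicit change channels (per-band difference + dNBR) to the
    post-fire image. pre/post: (bands, H, W) reflectance; indices are placeholders."""
    def nbr(img: np.ndarray) -> np.ndarray:
        # Normalized Burn Ratio: (NIR - SWIR) / (NIR + SWIR)
        n, s = img[nir], img[swir]
        return (n - s) / np.clip(n + s, 1e-6, None)

    dnbr = nbr(pre) - nbr(post)   # classic burn-severity index
    delta = post - pre            # raw per-band change
    return np.concatenate([post, delta, dnbr[None]], axis=0)

pre = np.random.rand(6, 32, 32).astype(np.float32)
post = np.random.rand(6, 32, 32).astype(np.float32)
x = add_delta_channels(pre, post)  # (6 + 6 + 1, 32, 32)
```

Feeding the model the difference directly, instead of hoping it learns to subtract two timestamps internally, is what makes the temporal signal explicit.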
Optical imagery alone fails in cloudy conditions — a real constraint for flood detection. Implemented AnySat (CVPR 2025) to fuse SAR and optical data from Sentinel-1 and Sentinel-2, enabling all-weather flood segmentation.
Godel Space
Problem: satellite imagery for disaster monitoring requires multi-sensor fusion at scale — a hard data engineering and ML challenge. Built the full pipeline from raw Sentinel data to deployed inference.
Delivered production-ready multi-sensor satellite inference system for disaster monitoring.
Unify
Problem: teams building with LLMs were locked into single providers, making cost/quality trade-offs impossible. Built a modular integration layer so switching providers or running A/B tests required minimal code changes.
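The shape of such an integration layer — a common provider interface plus a router — can be sketched in a few lines. The class names and fake providers below are hypothetical, not Unify's API:

```python
import random
from typing import Protocol

class ChatProvider(Protocol):
    """Minimal common interface every provider adapter implements."""
    def complete(self, prompt: str) -> str: ...

class FakeOpenAI:
    def complete(self, prompt: str) -> str:
        return f"[openai] {prompt}"      # stand-in for a real API call

class FakeAnthropic:
    def complete(self, prompt: str) -> str:
        return f"[anthropic] {prompt}"   # stand-in for a real API call

class ABRouter:
    """Send a fraction of traffic to a candidate provider for A/B tests."""
    def __init__(self, control: ChatProvider, candidate: ChatProvider,
                 ratio: float = 0.5):
        self.control, self.candidate, self.ratio = control, candidate, ratio

    def complete(self, prompt: str) -> str:
        provider = self.candidate if random.random() < self.ratio else self.control
        return provider.complete(prompt)
```

Because callers only see `complete()`, swapping providers or changing the A/B split is a one-line config change rather than a rewrite.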
Delivered production-ready LLM framework enabling instant provider switching and A/B testing.
Edge Runners
The challenge: deploy a capable AI model on hardware with strict memory and compute limits. The solution: fine-tuned Phi-3 with quantization and pruning, hitting 85% of full-model accuracy at 4x the speed — good enough to be genuinely useful at the edge.
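The core trick behind quantization is easy to show in isolation. A minimal sketch of symmetric per-tensor int8 quantization — not the actual Phi-3 pipeline, which layers this with pruning and calibration:

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization: w ~= scale * q."""
    scale = max(float(np.abs(w).max()) / 127.0, 1e-12)  # guard all-zero tensors
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

w = np.random.randn(256, 256).astype(np.float32)
q, scale = quantize_int8(w)

# Dequantize and measure the rounding error (bounded by ~scale / 2).
w_hat = q.astype(np.float32) * scale
err = float(np.abs(w - w_hat).max())
```

Storing `q` instead of `w` cuts memory 4x versus float32, and int8 matmuls are what buy the speedup on edge hardware; the accuracy cost comes from exactly this rounding error accumulating across layers.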
B.Tech Information Technology
Haldia Institute of Technology · 2020-2024
Python Programming Essentials
Rice University
A Crash Course in Data Science
Johns Hopkins University
Complete SQL Bootcamp
Udemy
Notes on satellite AI, edge deployment, and things that broke before they worked.
trtexec accepts --memPoolSize=workspace:512MiB without complaint, then builds with ~1KB of workspace. No error. No warning. Here's what's actually happening and how to check if it bit you.
If you have a hard problem in satellite AI, earth observation, or ML systems — or a role where that kind of thinking is useful — I'd love to hear about it.