AI-Powered Crop Yield Prediction from Satellite Imagery for Food Security — Omdena Case Study
Discover how AI and satellite imagery improved crop yield prediction accuracy by 59% using CNN-LSTM and SARIMA models.

Executive Summary
Accurate crop yield forecasting is critical for strengthening food security and improving agricultural planning at scale. In Senegal, timely knowledge of what crops are growing and how much yield to expect remains essential for decision-making across the agricultural sector. This project, built by a team of 30 AI engineers in partnership with the Global Partnership for Sustainable Development Data, constructed an end-to-end satellite intelligence system capable of predicting crop yields across all fourteen Senegalese departments for three staple crops: Maize, Millet, and Rice. The system runs entirely on freely available satellite data and open source tools, eliminating the need for costly on-the-ground surveys.
Two parallel prediction pipelines were developed. The first uses a CNN LSTM deep learning architecture trained on spectral and temperature histograms derived from MODIS satellite imagery, with transfer learning from South Sudan and Ethiopia, reducing test set MSE for Maize from 0.2094 to 0.0849, a 59 percent improvement over training on Senegalese data alone. The second uses SARIMA time series forecasting of NDVI signals to produce yield estimates before the harvest season ends, achieving an absolute error of less than 25 percent for most Senegalese regions in the 2018 Sorghum validation year. Together, they deliver georeferenced yield estimates scalable to any district with satellite coverage.
The Yield Data Gap: Why Senegal’s Food Security Needs Better Intelligence
Senegal depends heavily on smallholder agriculture, with Maize, Millet, and Rice accounting for the bulk of domestic food production across markedly different agroclimatic zones — from the dense groundnut belt of the west to the irrigated river plains of the north and the mixed crop systems of the south. Reliable crop yield data is essential for everything from national food security planning to targeted subsidy distribution, import timing, and climate adaptation policy. Yet the dominant source of that data in Senegal and across much of Sub-Saharan Africa remains farmer self-reporting through seasonal surveys: a method that is slow, expensive, geographically uneven, and difficult to verify at scale.
Survey-based yield estimates are collected months after the harvest season ends, making them useless for in-season intervention. They suffer from response bias, sampling gaps in remote agricultural areas, and inconsistent collection methods across departments. For policymakers who need to act before food shortages compound into crises, the data arrives too late and with too little geographic resolution to be operationally useful.
Satellite imagery changes the calculus entirely. Satellites continuously observe the entire national territory at no incremental cost per hectare, producing spectral signals that reliably correlate with vegetation health and crop biomass. The technical challenge is not data access but building the pipeline that translates raw spectral observations into calibrated yield estimates tied to specific crop types, specific departments, and specific seasons. That is what this project set out to build.
Satellite Imagery and the Phenological Signal Behind Every Prediction
The project drew on two primary satellite datasets, both accessed through Google Earth Engine. Surface reflectance came from MOD09A1.006 Terra, a MODIS product delivering eight-day composite imagery at 500-metre resolution across seven spectral bands. Land surface temperature came from MYD11A2.006 Aqua, providing eight-day thermal observations at one-kilometre resolution. Together, they capture two complementary signals: the greenness trajectory of vegetation and the thermal environment in which crops develop.
The normalized difference vegetation index, NDVI, is the central derived signal throughout the project. Computed from near-infrared and red reflectance bands, NDVI tracks vegetation density and photosynthetic activity across time. Its value for yield prediction lies not in any single observation but in the time series it produces across a growing season. Different crops have distinct phenological signatures: they emerge, grow, peak, and senesce at different rates and timings. NDVI time-series data encodes these signatures, allowing a model trained on historical yield records to link spectral trajectories to expected tonnes per hectare.
Aligning satellite observations with the correct growing windows required using Senegal’s FAO crop calendar, which varies by crop type. For Maize, the relevant window runs approximately weeks 19 to 30; Millet and Rice follow different schedules. Using imagery outside the growing season introduces noise without a predictive signal. All datasets were filtered to the relevant growing period before any feature extraction or model training, ensuring that the spectral features used to train the models encode crop development rather than fallow or post-harvest conditions.

Figure 1: Agricultural production zones across Senegal. Crop types and growing systems vary significantly by region — from the groundnut basin of the centre-west to rice cultivation along the northern river plains — shaping which satellite signals are most predictive for each district.
Ground Truth Scarcity and the Data Alignment Challenge
The most consequential constraint this project faced was not computational but epistemic: the available ground truth for Senegal was severely limited. The primary source was the IPAR dataset, which provided GPS-located yield records for multiple crops but only for the year 2014. A deep learning model trained on a single year of sparse, geographically concentrated observations has no reliable way to distinguish genuine crop-yield relationships from historical coincidences of that particular growing season.
Land cover masking introduced a second challenge. Standard practice is to apply a cropland mask to satellite imagery, removing non-agricultural pixels before training. The most commonly used dataset for this — the MCD12Q1.006 MODIS Land Cover product — proved unreliable for Senegal: its cropland classification did not match the spatial distribution of crops in the IPAR ground-truth records, which were concentrated in the south-west and northern regions. The team identified this mismatch through direct comparison and switched to the Copernicus Global Land Cover dataset, which aligned significantly better despite being available only from 2015 onward. This required assuming 2015 land cover as a valid proxy for 2014 — a documented limitation the team handled explicitly rather than silently.
To address the scarcity of IPAR data, two strategies were implemented in parallel. For the deep learning pipeline, transfer learning yielded records from South Sudan and Ethiopia, providing the model with a pre-training signal before Senegalese fine-tuning. Data augmentation via sliding spatial windows around each GPS point expanded the training set by assuming neighbouring pixels shared similar yield conditions. For the time-series pipeline, a separate regional yield dataset from OpenDataForAfrica was used, covering 2016 to 2019 and providing greater temporal depth. Each strategy addressed data scarcity differently, and neither fully resolved it — a constraint the team acknowledged and built around rather than obscured.

Figure 2: IPAR ground truth GPS locations across Senegal for Maize, Millet, Rice, and other crops. Ground truth points are geographically concentrated, with limited coverage across the north and east — directly shaping the model’s capacity to generalise to undersampled regions.

Figure 3: Side-by-side comparison of MODIS land cover (left) and Copernicus land cover (right) for Senegal. The MODIS product misclassifies significant cropland areas; the Copernicus dataset aligns substantially better with IPAR ground truth locations.
Two Pipelines, Two Different Questions
Rather than committing to a single modelling paradigm, the project developed two independent prediction pipelines that operate on overlapping yet distinct data representations. The first is a supervised deep learning pipeline using CNN-LSTM architecture applied to spectral-thermal histograms, requiring labelled yield ground truth for training. The second is a time-series pipeline that uses SARIMA models to forecast NDVI trajectories and infer yield from the forecast signal, requiring no crop yield labels at prediction time. Running both allowed the team to assess the trade-offs between accuracy, data requirements, and operational timeliness that any real deployment would need to navigate.
Deep Learning Pipeline: CNN-LSTM with Transfer Learning
The deep learning approach was grounded in an architecture first validated for county-level soybean yield prediction: a CNN-LSTM model that processes spatial patterns through convolutional layers before capturing temporal dynamics through an LSTM sequence model. Applied to Senegal, this architecture needed to generalise across a far more data-scarce environment than its original context.
Feature Representation: 3D Histograms Over Raw Imagery
Why histograms rather than raw images? Instead of feeding raw satellite image patches into the model, the pipeline converted each satellite observation into a three-dimensional pixel-count histogram that aggregated spectral and temperature values across the crop-masked region of interest. This representation is more compact, less sensitive to exact spatial alignment, and substantially reduces the risk of overfitting on limited ground truth data. The histogram captures the statistical distribution of spectral conditions across a region, without requiring the model to learn from pixel-level spatial patterns that may not generalise across departments.
Multi-band inputs. Histograms were generated from the surface reflectance bands of MOD09A1 and the temperature bands of MYD11A2, masked to cropland pixels using the Copernicus land cover layer and aligned to the relevant growing season windows from the FAO crop calendar. Stacked across time steps within the growing season, these histograms form the temporal input sequence fed into the LSTM component of the architecture.

Figure 4: CNN-LSTM architecture. Each time step’s histogram is processed through two convolutional blocks (Conv2D, Batch Normalisation, MaxPool2D) to extract spatial features, which are then flattened and fed into an LSTM to model temporal dynamics across the 34 growing-season timestamps before a Dense layer outputs the yield prediction.
Transfer Learning and Data Augmentation
Cross-country pre-training. With only a single year of IPAR ground truth available for Senegal, the model was first pre-trained on yield records from South Sudan and Ethiopia, where comparable labelled data were available from an existing deep transfer-learning crop prediction dataset. Pre-training gave the model a generalised understanding of the relationship between spectral-thermal histograms and yield outcomes before it encountered Senegal-specific supervision. The model was then fine-tuned on IPAR Senegal data, adapting the learned representations to local crop varieties and agroclimatic conditions.
Sliding window augmentation. To expand the training set beyond the IPAR GPS points, the team applied spatial data augmentation. For each labelled location, additional training samples were generated by sliding windows around the original coordinates, assuming that yield conditions in immediately neighbouring pixels were similar to those at the ground-truth point. This is a documented technique for small-sample remote sensing problems and produced measurable improvements in training stability without requiring additional field data collection.
Results: Where Transfer Learning Made the Difference
Maize. Transfer learning produced the largest measurable improvement for Maize. The CNN-LSTM pre-trained on Ethiopia and South Sudan data and then fine-tuned on IPAR Senegal data achieved a test-set MSE of 0.0849 — compared with 0.2094 for a model trained only on IPAR 2014 data. That is a 59 percent reduction in error, driven entirely by the richer pre-training signal rather than any architectural change.
Millet. The Millet model achieved a test-set MSE of 0.1293 on IPAR data, a moderate result that did not benefit from transfer learning to the same degree as Maize. The difference likely reflects a smaller ground-truth sample for Millet and greater agroclimatic variability in Millet-growing regions than in the agroclimatic conditions captured in the pre-training data from South Sudan and Ethiopia.

Figure 5: End-to-end system pipeline from satellite data acquisition through preprocessing and dual-pipeline modelling to department-level yield maps and the interactive prediction application.
Time-Series Pipeline: SARIMA Forecasting of NDVI and Yield
The time-series pipeline was designed around a different operational question: can yield be estimated earlier in the season, before harvest data is available, by forecasting the NDVI trajectory and inferring yield from that forecast? If so, policymakers could receive preliminary estimates while the growing season is still active, enabling earlier intervention. This required two linked models: one to forecast NDVI, and one to translate the forecast NDVI into a yield estimate.

Figure 6: Time-series pipeline overview. Historical NDVI and climate data are used to train a SARIMA forecasting model that projects NDVI for the current season. A separate regression model then maps the forecast NDVI window to expected yield — enabling predictions before harvest data are available.
NDVI Forecasting with SARIMA
Data and regional scope. The NDVI input came from the ASAP dataset, which provides observations every 10 days starting in 2002, giving the model nearly two decades of historical NDVI data at the regional level. SARIMA models were trained independently for each Senegalese region to capture region-specific seasonal patterns, trend components, and error structures. The team also evaluated SARIMAX with climate covariates, as well as DNN and LSTM alternatives, finding that SARIMA provided a reliable and computationally efficient baseline.
Forecast quality. SARIMA forecasts for the holdout period captured the broad seasonal trajectory of NDVI — including the timing of the growing season peak and the rate of senescence — reasonably well for regions with moderate interannual NDVI variability. Regions with more erratic patterns, driven by irregular rainfall, were harder to forecast. The forecasted NDVI series were then fed into the second stage as synthetic inputs, replacing observed NDVI values to simulate the in-season prediction scenario.
From NDVI Forecast to Yield Estimate
Feature construction. For each region and year, a set of NDVI-derived features was computed: maximum NDVI during the season, cumulative NDVI summed across the growing window, NDVI windows corresponding to early, mid, and late growing periods, and analogous features from rainfall and temperature time series. These features served as inputs to both linear and nonlinear regression models predicting annual regional yield.
Sorghum results. For Sorghum, the linear regression model achieved an absolute error of less than 25 percent for most Senegalese regions in the 2018 validation year. The 2019 test year was harder: Sorghum yields in 2019 were anomalously high, and the model — trained on 2016 to 2018 patterns — could not predict such an extreme value. This failure illustrates the core risk of all yield forecasting from limited historical data: generalising across anomalous climate years is only possible if the training set includes comparable anomalies.
Maize time-series. Maize yield prediction using the regional time-series approach did not perform as well as the deep learning pipeline, consistent with Maize being more spatially variable and more dependent on local soil conditions, which the regional NDVI signal does not capture. This cross-method comparison was valuable precisely because it pointed to where deep learning adds genuine lift over simpler methods — and where it does not.

Figure 7: SARIMA NDVI forecast for Kaffrine region. The left panel shows the full 18-year training history and forecast window. The right panel zooms into the 2019 forecast period, showing that the model closely tracks the seasonal NDVI trajectory, validating the approach as a basis for in-season yield inference.
Results Across Three Crops and Four Years
Yield predictions were generated for Maize, Millet, and Rice across all Senegalese departments from 2015 to 2018. For Maize, the CNN-LSTM with transfer learning delivered the strongest result: a test-set MSE of 0.0849, compared to 0.2094 for the IPAR-only baseline, confirming that cross-country pre-training is the most important single lever when domestic ground truth is limited to a single year. For Sorghum, the SARIMA-based time-series approach achieved sub-25 percent error rates across most regions in 2018, making it operationally useful for in-season policy planning despite its limitations in anomalous years such as 2019.
The two pipelines are complementary rather than competing. The deep learning model is more accurate for crops with sufficient similarity to the transfer-learning source domains and produces its best estimates from full growing-season imagery. The SARIMA pipeline sacrifices some accuracy for operational earliness. Because it uses only historical NDVI series and a yield regression model, it can produce preliminary forecasts while the current season is still active. Together, they give decision-makers both an early warning signal and a more precise post-season estimate from the same underlying satellite infrastructure.
Honest assessment of the results requires acknowledging what data limitations prevent. A single year of IPAR ground truth constrains what any model can learn about Senegalese conditions. The 2019 Sorghum generalisation failure is a clear illustration: without anomalous years in the training set, the model has no way to anticipate them in production. These are not failures of methodology but of data availability — and they point directly to where future investment in ground-truth collection would yield the largest gains in prediction quality.

Figure 8: Predicted Maize yield (T/ha) for all Senegalese departments across four years (2015-2018). Each cluster of bars represents one department; colour encodes the year. The variation across departments and years reflects real agroclimatic differences captured by the CNN-LSTM model.
Model performance summary
| Model / Approach | Crop | MSE (Test Set) | Key Finding |
|---|---|---|---|
| CNN-LSTM + Transfer Learning | Maize | 0.0849 | Best result — 59% lower error than IPAR-only training |
| CNN-LSTM (IPAR 2014 only) | Maize | 0.2094 | Baseline — limited by single-year ground truth |
| CNN-LSTM (IPAR data) | Millet | 0.1293 | Moderate result — less transfer learning benefit |
| SARIMA + NDVI Regression | Sorghum | <25% abs. error | Most regions’ 2018 — 2019 anomaly was not captured |
From Model to Policy: The Interactive Prediction App
Machine learning predictions only matter if decision-makers can access them. The project addressed this by converting the CNN-LSTM prediction pipeline into an interactive web application built on Jupyter. It was delivered through Voila, a tool that transforms Jupyter notebooks into standalone, browser-accessible applications that require no Python knowledge to operate. A user can open the app, select a region of interest by drawing on an interactive map, choose a year and crop type, and receive a yield estimate without writing a single line of code.
The technical flow underneath the interface is the full prediction pipeline condensed into a guided interaction. Once a region is selected, the app retrieves the corresponding satellite imagery from Google Earth Engine, generates the three-dimensional spectral histograms, passes them through the pre-trained CNN-LSTM model, and renders the yield estimate alongside a spatial visualisation of predicted values across the selected area. An alternative input mode accepts GPS latitude and longitude coordinates directly, allowing the same prediction to be triggered from field survey records without manual map interaction.
This matters for the case study, not as a demonstration of Jupyter widget capability but as a systems design choice. Prediction models that require a data scientist to operate cannot scale to district agricultural officers or government planning departments. Wrapping the ML pipeline in an accessible interface is what separates a research experiment from a deployable policy tool. Because the pipeline runs on GEE’s cloud infrastructure, adding a new district or crop year requires no local compute, no data download, and no code changes on the end user’s side.

Figure 9: Interactive map interface of the yield prediction app. The user draws a region of interest directly on the satellite map — here, showing southern Senegal — triggering the GEE image retrieval, histogram generation, and CNN-LSTM inference pipeline, which returns the district yield estimate.
What This System Makes Possible
Senegal produces over 1.5 million tonnes of cereals annually, with smallholder farmers accounting for most of that production across highly variable agroclimatic zones. The system built here offers concrete capabilities that survey-based approaches cannot match.
- National and regional agricultural ministries can generate department-level yield estimates before the harvest season ends, enabling pre-positioned food aid, import planning, and price stabilisation interventions weeks earlier than survey-based timelines allow.
- The SARIMA forecasting component can deliver preliminary yield signals mid-season using only NDVI time-series history and no current-season ground truth, giving policymakers an early warning system for shortfall years at near-zero marginal cost per forecast.
- The CNN-LSTM architecture is replicable to any crop with available ground truth in a climatically similar country, allowing transfer learning to bring new crop models online without requiring multi-year domestic labelling campaigns before the first usable prediction.
- The interactive Voila application brings district-level yield queries within reach of non-technical agricultural officers, removing the data science barrier that typically prevents ML outputs from reaching the people closest to the problem.
- The methodological framework for combining supervised deep learning and unsupervised time-series forecasting in a data-scarce environment is directly applicable to other Sub-Saharan countries with similar ground truth limitations and equivalent MODIS or Sentinel-2 coverage.
About the Project
This project was completed by Omdena in partnership with the Global Partnership for Sustainable Development Data (GPSDD), with a team of 30 AI engineers contributing across data engineering, model development, time-series analysis, and application deployment. The work applied satellite remote sensing, deep learning, and scalable cloud computing to address one of the most data-scarce and operationally significant prediction problems in Sub-Saharan agriculture: reliable crop yield estimation for smallholder-dominated landscapes.
If you are working on food security intelligence, satellite-based yield estimation, or agricultural AI systems in data-scarce environments, reach out to the Omdena team to discuss how this methodology applies to your context.



