AI for Solar Energy Adoption in Sub-Saharan Africa

December 4, 2025 · 11 min read · Updated December 8, 2025

Omdena

Sub-Saharan Africa faces a critical energy deficit. More than 600 million people live without reliable electricity, and demand is rising by 11 percent annually—the fastest rate worldwide. Over the next two decades, power needs are expected to reach 390 TWh, with solar projected to supply roughly 70 percent of that growth. As photovoltaic costs continue to decline, solar energy stands out as the most viable, scalable, and sustainable path to electrify communities where extending the grid is neither practical nor affordable.

To help accelerate this transition, Omdena partnered with Zimbabwe-based NeedEnergy to apply artificial intelligence to solar planning and management. The collaboration resulted in two powerful dashboards: one that supports long-term decision-making by sizing PV systems and estimating financial returns, and another that provides short-term forecasts to help users balance demand and generation. By combining data science, physics-based models, and modern web technologies, the project demonstrates how intelligent tools can unlock cleaner, more reliable power for millions.

Bringing Solar Data and AI Together

Solar energy is a promising renewable resource for a climate‑friendly future that reduces reliance on finite fossil fuels. Yet converting sunshine into usable electricity requires careful planning. In markets like Harare, Zimbabwe, there is little public data on energy consumption, and prospective adopters need robust tools to evaluate whether going off grid makes sense. To overcome these barriers, the Omdena–NeedEnergy team built dashboards that combine meteorological information, consumption data, and open‑source libraries to help users size PV systems and plan around their expected savings.

The PV Sizing Tool: Long‑Term Planning

The first dashboard is a long‑term planning tool for Harare that helps prospective customers estimate how much solar energy their PV installation will produce over the next 20 years, the typical life span of solar panels. In addition to estimating generation, the tool calculates the number of panels needed to meet demand and compares the installation cost with projected savings to assess return on investment. This type of PV sizing application is common in countries with mature solar markets, but it is novel in Zimbabwe because of the limited availability of consumption data.

Construction of solar panel array structure (Source: Wikimedia Commons)

Data Collection and Wrangling

Building the PV sizing tool starts with assembling the right datasets. In this case the model draws on three primary data sources: energy consumption from NeedEnergy’s clients (accessed through their proprietary API); solar irradiance measurements for Harare from Solcast averaged over 14 years (2007 to 2021) and saved as the Mean Meteorological Year; and panel and inverter specifications from the PVLIB Python library.

The irradiance record includes three key variables—Diffuse Horizontal Irradiance (DHI), Direct Normal Irradiance (DNI) and Global Horizontal Irradiance (GHI)—which are essential for modeling PV performance. (For a primer on solar irradiance terminology, see this article from the National Renewable Energy Laboratory.) Researchers merged the historical irradiance record with the client consumption data using the Pandas library and set the date column as the index. After this preparation, the tool offers two options to users: estimate potential savings or estimate the size of a PV array for their home or business.
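The merge step described above can be sketched as follows. This is a minimal illustration with invented numbers; the column names (`date`, `Ghi`, `Dni`, `Dhi`, `consumption`) and the inner join on the timestamp are assumptions, not NeedEnergy's actual schema.

```python
import pandas as pd

# Hypothetical hourly irradiance record (W/m^2); values are invented.
irradiance = pd.DataFrame({
    "date": pd.date_range("2021-01-01 10:00", periods=3, freq="h"),
    "Ghi": [820.0, 910.0, 950.0],
    "Dni": [700.0, 760.0, 790.0],
    "Dhi": [150.0, 160.0, 165.0],
})

# Hypothetical client consumption (W) over an overlapping period.
consumption = pd.DataFrame({
    "date": pd.date_range("2021-01-01 11:00", periods=3, freq="h"),
    "consumption": [1200.0, 1350.0, 1100.0],
})

# Keep only timestamps present in both sources, then index by date.
data_frame = (
    irradiance.merge(consumption, on="date", how="inner")
    .set_index("date")
    .sort_index()
)
print(data_frame)
```

An inner join drops hours that lack either irradiance or consumption, which keeps later calculations free of missing values at the cost of a shorter record.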

The User Input Display for the PV Sizing Tool (Source: Omdena)

The interface shown above allows users to choose a panel type, inverter type, price per installed watt, time horizon and number of panels before running the calculations. It is designed to be intuitive for prospective customers who may not have technical expertise.

Pipeline for the PV Sizing Tool (Source: Omdena)

Modeling Solar Energy Production

Once the input data have been prepared, the tool can model how much electricity a PV system will generate. It has two main purposes: calculating net savings and sizing the PV system. The first purpose estimates the difference between avoided utility bills and the initial investment over the chosen time horizon, given the system cost, the energy consumption profile and the assumed electricity price. The second purpose computes how many panels of a given type are required to meet demand based on the consumption history and the average irradiance profile.
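One simple way to express the sizing step is to divide average demand by the average output of a single panel and round up. The function name and the numbers below are illustrative assumptions; the article does not state the exact rule the dashboard applies.

```python
import math

def panels_needed(mean_demand_w: float, mean_output_per_panel_w: float) -> int:
    """Smallest panel count whose average output covers average demand."""
    if mean_output_per_panel_w <= 0:
        raise ValueError("per-panel output must be positive")
    return math.ceil(mean_demand_w / mean_output_per_panel_w)

# Illustrative numbers only: 1,500 W average demand and 45 W average
# output per panel (averaged over day and night).
print(panels_needed(1500.0, 45.0))  # 34
```

Sizing on averages ignores the timing mismatch between daytime generation and evening demand, so a count obtained this way is best read as a lower bound when no storage is available.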

The dashboard relies on PVLIB to simulate PV system performance. PVLIB is an open‑source package developed at Sandia National Laboratories that provides functions for modeling irradiance, temperature effects, and PV component behaviour. Due to limited data availability in Harare, the team made several assumptions: energy demand is periodic, the growth rate of demand is treated as constant, and seasonality is not fully captured because much of NeedEnergy’s API data covers less than one year.

First, the installation parameters such as module type, inverter type and temperature model are retrieved from PVLIB:

import pvlib

# Load the Sandia module database and the CEC inverter database.
sandia_modules = pvlib.pvsystem.retrieve_sam('SandiaMod')
sapm_inverters = pvlib.pvsystem.retrieve_sam('cecinverter')

# module_name and inverter_name come from the user's selections.
module = sandia_modules[module_name]
inverter = sapm_inverters[inverter_name]

temperature_model_parameters = pvlib.temperature.TEMPERATURE_MODEL_PARAMETERS['sapm']['open_rack_glass_glass']

system = {'module': module,
          'inverter': inverter,
          'surface_azimuth': 180}

Next, the meteorological context is specified. The tool calculates the sun’s position and determines the air mass and atmospheric pressure at Harare’s latitude (−17.824858) and longitude (31.053028) with an altitude of 1,490 m:

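altitude = 1490
latitude = -17.824858
longitude = 31.053028
times = data_frame.index

# Tilt is set equal to latitude, as in the original code. Note that latitude
# is negative for Harare; a north-facing array (surface_azimuth = 0) tilted
# at abs(latitude) is the usual way to express this orientation.
system['surface_tilt'] = latitude

# Solar position, extraterrestrial irradiance, and air mass at each timestamp.
solpos = pvlib.solarposition.get_solarposition(times, latitude, longitude)
dni_extra = pvlib.irradiance.get_extra_radiation(times)
airmass = pvlib.atmosphere.get_relative_airmass(solpos['apparent_zenith'])
pressure = pvlib.atmosphere.alt2pres(altitude)
am_abs = pvlib.atmosphere.get_absolute_airmass(airmass, pressure)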
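To make the pressure and air-mass step concrete, the sketch below reproduces the arithmetic without pvlib. It assumes the standard-atmosphere relation given in the pvlib documentation for `alt2pres` and the usual pressure scaling for absolute air mass; the helper name and the relative air mass of 1.5 are illustrative, not taken from the project.

```python
def altitude_to_pressure(altitude_m: float) -> float:
    """Approximate site pressure in pascals from altitude in metres,
    using a standard-atmosphere relation (assumed to match alt2pres)."""
    return 100 * ((44331.514 - altitude_m) / 11880.516) ** (1 / 0.1902632)

pressure = altitude_to_pressure(1490)

# Absolute air mass scales relative air mass by pressure over sea-level pressure.
relative_airmass = 1.5  # illustrative value, not computed from solar position
absolute_airmass = relative_airmass * pressure / 101325.0

print(round(pressure), round(absolute_airmass, 3))
```

Because Harare sits well above sea level, the site pressure is lower than 101,325 Pa, so the absolute air mass comes out smaller than the relative value.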

Using these angles and irradiance values, the code computes the effective irradiance on the panel surface and converts it to direct current (DC) and alternating current (AC) power. The parameter number_modules corresponds to the number of installed panels:

import numpy as np

# Angle of incidence between the sun and the panel surface.
aoi = pvlib.irradiance.aoi(system['surface_tilt'], system['surface_azimuth'],
                           solpos['apparent_zenith'], solpos['azimuth'])
# Plane-of-array irradiance from the DNI, GHI and DHI columns.
total_irrad = pvlib.irradiance.get_total_irradiance(
    system['surface_tilt'], system['surface_azimuth'],
    solpos['apparent_zenith'], solpos['azimuth'],
    data_frame['Dni'], data_frame['Ghi'], data_frame['Dhi'],
    dni_extra=dni_extra, model='haydavies')

# Cell temperature; temp_air and wind_speed are assumed to be defined
# beforehand from the weather data.
tcell = pvlib.temperature.sapm_cell(
    total_irrad['poa_global'], temp_air, wind_speed,
    **temperature_model_parameters)

effective_irradiance = pvlib.pvsystem.sapm_effective_irradiance(
    total_irrad['poa_direct'], total_irrad['poa_diffuse'],
    am_abs, aoi, module)
dc = pvlib.pvsystem.sapm(effective_irradiance, tcell, module)

# AC output of one module through the inverter model, scaled by the
# number of panels and clipped at zero.
ac_power = np.maximum(number_modules *
                      pvlib.inverter.sandia(dc['v_mp'], dc['p_mp'], inverter), 0)

Calculating Net Estimated Savings

Finally, the tool calculates the net savings for a PV installation by comparing the cost of purchasing electricity with the cost of installing panels. It takes as input the merged dataframe of irradiance and consumption, the time horizon, price per watt, number of panels, chosen module and inverter types, and the retail price of electricity per kilowatt‑hour:

# Power that solar can offset is capped by actual consumption at each timestep.
power_savings = np.minimum(data_frame['consumption'],
                           panels_count * estimated_generation_by_unit)
mean_hourly_power_savings_by_day = power_savings.groupby(
    power_savings.index.date).mean()
expected_hourly_power_savings = mean_hourly_power_savings_by_day.mean()
# Scale the average hourly saving (W) to the full horizon in years (Wh).
potential_energy_savings = expected_hourly_power_savings * time_horizon * 365 * 24
# Rated wattage is parsed from the fourth underscore-separated token of the module name.
watts_per_panel = int(panel_type.split('_')[3][0:3])
initial_investment = panels_count * watts_per_panel * price_per_watt
# Convert Wh to kWh before applying the retail price.
total_invoice_reduction = potential_energy_savings * (price_per_kwh / 1000)
potential_savings = total_invoice_reduction - initial_investment
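A worked example helps check the units in the calculation above. The sketch below wraps the same arithmetic in a function; the function name and every input value are invented for illustration.

```python
def net_savings(expected_hourly_power_savings_w, time_horizon_years,
                panels_count, watts_per_panel, price_per_watt, price_per_kwh):
    """Net savings = avoided electricity cost minus upfront panel cost."""
    # Average offset power (W) times hours in the horizon gives energy in Wh.
    energy_wh = expected_hourly_power_savings_w * time_horizon_years * 365 * 24
    # Dividing the price by 1000 converts it from per-kWh to per-Wh.
    invoice_reduction = energy_wh * (price_per_kwh / 1000)
    initial_investment = panels_count * watts_per_panel * price_per_watt
    return invoice_reduction - initial_investment

# Illustrative inputs: 500 W average offset, 20 years, ten 300 W panels
# at 1.5 per installed watt, electricity at 0.10 per kWh.
savings = net_savings(500.0, 20, 10, 300, 1.5, 0.10)
print(round(savings, 2))  # 4260.0
```

Here 500 W over 20 years is 87,600 kWh, worth 8,760 at the assumed tariff, against an upfront cost of 4,500. The calculation applies no discounting, panel degradation, or tariff changes, in line with the formula above.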

When users select a time horizon and enter their system details, the dashboard returns interactive plots. The red curve represents the estimated solar generation, while the blue curve represents historical energy consumption. Because the plots are built with the Plotly library, users can zoom in to examine specific days or zoom out to view longer trends.

Resulting plots of solar energy produced vs energy demand for a client in Harare (Source: Omdena)

Another chart depicts the daily variation in energy demand and predicted production. Interactive charts help prospective clients understand whether the chosen panel configuration will meet their needs throughout the year.

Plot capturing the daily variation in energy demand and solar energy production (Source: Omdena)

The Energy Alert Tool: Short‑Term Forecasting

While the PV sizing tool supports long‑term planning, the second dashboard serves existing solar customers by predicting energy demand and generation over the next 36 hours. This short‑term dashboard provides alerts when the forecasted solar output is likely to fall short of demand so that users can adjust their consumption or switch to backup sources. It was built specifically for Harare, where consumption data were available and where the Solcast API can provide reliable seven‑day irradiance forecasts.

Pipeline for the Energy Alert Tool (Source: Omdena)

The short‑term dashboard draws on three sources: historical consumption data from NeedEnergy’s API; a seven‑day solar irradiance forecast for Harare from the Solcast API that is used to generate hourly predictions; and panel and inverter specifications (including tilt angle, azimuth angle and panel count) from the PVLIB Python library.

The dashboard interface allows users to specify these parameters. It then forecasts demand and generation and displays whether the solar array will meet demand over the forecast horizon.

Input section for the short‑term dashboard (Source: Omdena)

Energy Demand Forecasting with LightGBM

Predicting future consumption requires sophisticated modelling. The team chose LightGBM, a fast, memory‑efficient gradient boosting library that performs well on large datasets (see the lightgbm package documentation for more information). Because each client has unique consumption patterns, the model trains a separate forecaster for each user and for each of the 36 forecasting hours. This approach avoids the complexity of fitting a single model across all clients and allows the system to capture individual behaviour more accurately.
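Training one model per forecast hour is often called the direct multi-step strategy. The sketch below shows how training pairs could be built for each horizon. To stay dependency-light it uses an ordinary least-squares fit from NumPy as a stand-in for LightGBM, and it uses 24 lags and 3 horizons on a synthetic series rather than 72 lags and 36 horizons. All names and data are illustrative.

```python
import numpy as np

def make_supervised(series, n_lags, horizon):
    """Build (X, y): each X row holds n_lags consecutive values and y is
    the value `horizon` steps after the end of that window."""
    X, y = [], []
    for t in range(n_lags, len(series) - horizon + 1):
        X.append(series[t - n_lags:t])
        y.append(series[t + horizon - 1])
    return np.array(X), np.array(y)

# Synthetic hourly demand with a 24-hour cycle (invented data).
hours = np.arange(24 * 14)
demand = 1000 + 300 * np.sin(2 * np.pi * hours / 24)

n_lags, horizons = 24, [1, 2, 3]
models = {}
for h in horizons:
    X, y = make_supervised(demand, n_lags, h)
    A = np.column_stack([X, np.ones(len(X))])  # add an intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    models[h] = coef  # one independent model per horizon

# Forecast the next three hours from the most recent 24 observations.
latest = np.append(demand[-n_lags:], 1.0)
forecast = {h: float(latest @ models[h]) for h in horizons}
print(forecast)
```

Each horizon gets its own target column and its own fitted model, so errors do not compound from one step to the next as they would if a one-step model were applied recursively.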

To reduce overfitting and simplify the model, the team employed a pruning strategy based on SHAP values, which measure how much each input contributes to the model’s output. Initially, the training set includes the previous 72 hours of consumption, the weekday, and the time of day. Two pruning steps are then applied. First, after training an initial LightGBM model, the input features are ranked by their SHAP values, and only the 20 most important variables are retained. A helper function sorts features by importance and keeps the top entries:

import numpy as np

def keep_importants(cols, importances, size=20):
    # Indices of the `size` largest importances, in descending order.
    important_index = np.argsort(importances)[::-1][:size]
    important_features = cols[important_index]
    return important_features

Second, the remaining variables are re‑evaluated by normalizing their SHAP values relative to the largest contribution. Features contributing less than 5 percent of the maximum importance are removed:

def keep_by_percentage(cols, importances, percentage=0.05):
    # Scale importances so the largest equals 1, then apply the cutoff.
    largest_importance = np.max(importances)
    normalized_importance = importances / largest_importance
    mask = normalized_importance > percentage
    important_features = cols[mask]
    return important_features

The pruning process is akin to recursive feature elimination (RFE), with the added benefit of SHAP values for interpretability. One caveat is that SHAP values may distribute the influence of a single correlated effect across multiple variables, causing important effects to appear less significant. Care must be taken when interpreting SHAP‑based pruning.

The table below summarizes which features remain at each stage of the pruning process.

Pruning stage	Features kept
Initial set	previous 72‑h consumption; weekday name; time of day
After step 1	twenty most important features by SHAP value
After step 2	features with ≥5 % of normalized SHAP importance
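The two pruning steps can be chained as in the sketch below. The feature names and importance values are invented, the top-k cutoff is reduced from 20 to 4 to keep the example small, and the same importances are reused in step 2 because the text does not say whether the model is retrained between steps.

```python
import numpy as np

def keep_importants(cols, importances, size=20):
    important_index = np.argsort(importances)[::-1][:size]
    return cols[important_index]

def keep_by_percentage(cols, importances, percentage=0.05):
    normalized = importances / np.max(importances)
    return cols[normalized > percentage]

# Hypothetical feature names with mean absolute SHAP values (invented).
cols = np.array(["lag_1", "lag_24", "hour", "weekday", "lag_48", "lag_7"])
importances = np.array([0.90, 0.60, 0.30, 0.02, 0.03, 0.01])

# Step 1: keep the top-k features by importance.
step1 = keep_importants(cols, importances, size=4)

# Look up the importance of each surviving feature, in step1's order.
lookup = dict(zip(cols, importances))
step1_importances = np.array([lookup[c] for c in step1])

# Step 2: drop features below 5 percent of the largest importance.
step2 = keep_by_percentage(step1, step1_importances, percentage=0.05)

print(list(step1))
print(list(step2))
```

In this toy case `lag_48` survives the top-4 cut but is removed in the second step, because 0.03 is only about 3 percent of the largest importance (0.90).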

Solar Energy Production Forecast with PVLIB

Once demand has been forecasted, the next challenge is to estimate how much electricity the PV system will generate over the same horizon. The tool queries the Solcast API through the solcast Python library to retrieve irradiance forecasts with 30‑minute resolution:

import pandas as pd
import solcast

latitude = -17.824858
longitude = 31.053028
API_KEY = "YOUR_API_KEY"  # Place your API_KEY here

data = solcast.get_radiation_forecasts(latitude, longitude, API_KEY)
seven_day_forecast = data.forecasts
data_df = pd.DataFrame(seven_day_forecast)
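Since the forecasts arrive at 30-minute resolution while demand is predicted hourly, the two series need a common time step. The sketch below averages half-hourly rows into hourly values with pandas. The field names (`period_end`, `ghi`, `dni`, `dhi`) and the numbers are assumptions for illustration; the actual response fields may differ.

```python
import pandas as pd

# Hypothetical 30-minute forecast rows (W/m^2); values are invented.
data_df = pd.DataFrame({
    "period_end": pd.date_range("2021-06-01 10:30", periods=4,
                                freq="30min", tz="UTC"),
    "ghi": [600.0, 640.0, 700.0, 720.0],
    "dni": [500.0, 520.0, 580.0, 600.0],
    "dhi": [120.0, 125.0, 130.0, 132.0],
})

# Each row is stamped with the END of its interval, so hourly bins are
# closed and labelled on the right: 10:30 and 11:00 form the 11:00 hour.
hourly = (
    data_df.set_index("period_end")
    .resample("1h", label="right", closed="right")
    .mean()
)
print(hourly)
```

Closing the bins on the right keeps each hourly average aligned with the period-end convention assumed here; with period-start timestamps the defaults would be the better fit.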

The team then repeats the PVLIB methodology described in the PV sizing section to estimate AC power generation. The time horizon for modelling solar production matches the 36‑hour demand forecast. After calculating both demand and generation, the tool compares the two series and issues alerts when predicted production falls below projected consumption by a predefined percentage.
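The comparison behind the alerts can be sketched in a few lines. The function name, the 10 percent threshold, and the sample values are assumptions for illustration; the article does not give the percentage the dashboard uses.

```python
def shortfall_alerts(demand, generation, threshold=0.10):
    """Return the indices of hours where generation falls short of demand
    by more than `threshold`, expressed as a fraction of demand."""
    alerts = []
    for hour, (d, g) in enumerate(zip(demand, generation)):
        if d > 0 and (d - g) / d > threshold:
            alerts.append(hour)
    return alerts

# Invented hourly forecasts in watts.
demand = [1000.0, 1200.0, 900.0, 1100.0]
generation = [1050.0, 1100.0, 600.0, 1000.0]

print(shortfall_alerts(demand, generation))  # [2]
```

Only hour 2 triggers an alert here: its shortfall is one third of demand, while the shortfalls in hours 1 and 3 stay under the 10 percent threshold and hour 0 has a surplus.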

In the snapshot above, the historical energy demand is shown in blue, the forecasted demand for the next 36 hours appears in purple, and the predicted solar production (given a chosen number of panels) is shown in red. Alerts notify users when the PV system will not meet demand so that they can plan accordingly.

Building and Deploying the Dashboards

The user interface for both dashboards is built with Streamlit, a Python framework for building data apps, and the applications are deployed on Heroku. Readers interested in deploying similar tools may consult the “Streamlit Tutorial: Deploying an AutoML Model Using Streamlit” and the tutorial by Navid Mashinchi, “A quick tutorial on how to deploy your Streamlit app to Heroku”, which explain these steps in detail.

Demo Dashboards

The figure below shows a demo of the dashboards in action. Users can explore how different panel configurations affect long‑term savings and monitor short‑term demand and generation forecasts in near real time.

Conclusion

Machine learning and physics‑based modelling can be powerful allies in the drive toward renewable energy adoption. By combining consumption data, irradiance records, and open‑source libraries such as PVLIB and LightGBM, the Omdena–NeedEnergy project demonstrates that it is possible to build practical tools for both long‑term planning and short‑term operational awareness. These dashboards are especially valuable in regions like Sub‑Saharan Africa, where electrification needs are urgent and reliable data are scarce. Interested readers can explore the dashboards using the link provided by Omdena and NeedEnergy.

If you want to expand solar adoption with confidence, reduce planning uncertainty, and improve system reliability, Omdena can co-build AI tools that convert complex data into clear electrification decisions.

FAQ

How can AI accelerate solar adoption in Sub-Saharan Africa?

AI enables precise planning, forecasting, and performance monitoring of PV systems. It fills data gaps, reduces financial uncertainty, and helps communities understand how much solar power they can generate and save, making adoption more practical and confident.

What are the biggest challenges to solar deployment in Sub-Saharan Africa, and how does AI help?

Low availability of consumption data, unpredictable demand, and limited financial planning tools are major barriers. AI solves these by analyzing consumption patterns, weather forecasts, and irradiance to model generation, predict shortages, and estimate ROI.

What is the PV Sizing Tool and who benefits the most from it?

The PV Sizing Tool is a long-term planning dashboard that estimates solar generation, number of panels needed, cost, and projected savings. Individuals, businesses, and energy planners benefit by making smarter investment decisions.

How accurate are the AI forecasts and models?

The system combines 14 years of irradiance data, client consumption history, Solcast forecasts, and robust models like PVLIB and LightGBM. While assumptions are required, the approach provides reliable predictions for both long-term and short-term planning.

Can these dashboards be adapted to other countries beyond Zimbabwe?

Yes, the methodology is transferable. By integrating regional irradiance datasets, local electricity pricing, and consumption patterns, similar dashboards can be deployed in Nigeria, Kenya, Ghana, South Africa, and other fast-growing solar regions.

What makes the short-term Energy Alert Tool valuable for existing solar users?

It predicts potential mismatches between energy demand and solar production 36 hours ahead and sends alerts. This helps users prepare backup options, adjust usage, and prevent outages—boosting reliability.

Why use LightGBM for energy demand forecasting?

LightGBM handles large datasets efficiently, requires fewer resources, and offers high predictive performance. Combined with SHAP pruning, it captures individual consumption behavior without overfitting.

What requirements are needed to implement similar AI-solar solutions?

You need basic consumption data, historic or forecast irradiance values, PV system specifications, and open-source tools like PVLIB, Pandas, Plotly, and LightGBM. Cloud deployment platforms (e.g., Streamlit, Heroku) make delivery scalable.
