Predict air quality with satellite imagery

Air pollution is one of the biggest environmental risks to public health, affecting millions worldwide. Cities and policymakers rely on accurate air quality predictions to implement preventive measures, reduce health risks, and comply with regulations. But how do we best predict air quality—using traditional data models or cutting-edge AI?
At Growing pAI, we partnered with the European Environment Agency (EEA) and Microsoft to explore this question. Our goal? To compare deep learning methods with traditional machine learning approaches to assess efficiency, accuracy, and real-world business impact.
The two approaches: AI vs. traditional models
We tested two distinct methodologies to predict key air quality metrics (CO₂, O₃, and NOₓ concentrations):
🔹 Deep learning approach: Using Keras, we trained a Convolutional Neural Network (CNN) to analyze satellite imagery and extract air pollution patterns. CNNs can capture complex spatial relationships that traditional models might miss.
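The spatial-pattern idea behind the CNN can be illustrated with a toy example: a single convolution filter sliding over a grid of pixel values, standing in for a satellite image tile. The edge-detection kernel and the tiny 4×4 input below are illustrative assumptions only, not our Keras production model.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2D convolution: slide the kernel over the image."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Toy "satellite tile": a bright region (e.g. a plume) on the right half.
tile = np.array([
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
], dtype=float)

# Vertical-edge filter: responds where intensity changes from left to right.
edge_kernel = np.array([
    [-1, 1],
    [-1, 1],
], dtype=float)

response = conv2d(tile, edge_kernel)
print(response)  # strongest activation along the plume boundary (column 1)
```

A trained CNN learns many such filters automatically and stacks them in layers, which is what lets it pick up spatial structure that a purely tabular model never sees.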
🔹 Traditional machine learning approach: We leveraged structured metadata—sensor readings, weather conditions, and geographical factors—enhanced with domain-driven heuristics. This method is widely used in environmental modeling due to its interpretability and reliability.
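As a minimal sketch of the metadata-driven approach, the snippet below fits a random forest on a synthetic tabular dataset. The feature set (sensor reading, temperature, wind speed) and the interaction term standing in for a domain-driven heuristic are illustrative assumptions, not the EEA data or our actual feature pipeline.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(seed=0)

# Hypothetical metadata features: [sensor reading, temperature, wind speed]
n = 500
X = rng.normal(size=(n, 3))
# Synthetic target: pollution rises with the sensor reading, falls with wind
# speed, plus an interaction term (a stand-in for a domain heuristic).
y = (2.0 * X[:, 0] - 1.5 * X[:, 2]
     + 0.5 * X[:, 0] * X[:, 1]
     + rng.normal(scale=0.1, size=n))

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X[:400], y[:400])
print("R^2 on held-out data:", round(model.score(X[400:], y[400:]), 3))
```

Because the inputs are named, physical quantities, a model like this stays easy to inspect and explain, which is exactly the interpretability advantage noted above.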
Key findings: When is AI worth it?

After extensive testing, we made several important discoveries:
✅ Traditional Models Held Their Ground: Surprisingly, our metadata-driven model performed comparably to the deep learning approach. Carefully engineered heuristics bridged much of the gap in accuracy.
✅ Deep Learning Requires Heavy Resources: While CNNs offered marginal improvements, they demanded significantly more computational power and training time.
✅ AI Captures Spatial Patterns But Needs More Data: The deep learning approach showed promise in detecting intricate spatial trends, but traditional models remained strong due to their explainability and lower overhead.
MLflow on Databricks: Seamless Training and Deployment

To manage our model experiments efficiently, we leveraged MLflow on Databricks. This enabled structured tracking of training runs, hyperparameter tuning, and model performance comparison. Ultimately, we deployed our best-performing model on Azure Machine Learning Service, streamlining the entire workflow from experimentation to production.
Databricks proved to be an essential platform for this project, offering seamless integration between data engineering and data science. Recognizing the value of this ecosystem, Axel conducted a five-day training session to help teams understand the full potential of Databricks for both data science and engineering tasks. This training empowered participants to efficiently handle big data workflows and build scalable AI solutions.
Business takeaways: should you always use AI?

While AI is often seen as the gold standard, our findings suggest that traditional machine learning models can still be highly effective when designed with strong domain expertise. Before adopting AI for air quality prediction, businesses should consider:
- Data Readiness: Is your dataset robust and diverse enough to justify the use of AI for improved predictions?
- Cost vs. Benefit: Could a well-optimized traditional machine learning model offer similar accuracy with lower costs and complexity?
- Regulatory Needs: Does your use case require high interpretability for compliance or regulatory oversight, making black-box deep learning models less practical?
AI continues to evolve, and we see exciting opportunities to refine hybrid approaches—combining the best of deep learning and traditional modeling. By strategically integrating AI where it truly adds value, businesses and policymakers can make smarter, data-driven decisions.
At Growing pAI, we specialize in building AI solutions that make a real impact. Want to explore how AI can transform your environmental initiatives? Let's talk: by email at axel@growingpai.com or by phone at +32 475 54 2216.