
URBAN AIR QUALITY
Data Science / ML
NYIT
2024
Python / SUMO / XGBoost
AI-powered solution integrating machine learning and SUMO traffic simulation to predict and reduce urban air pollution in real time.
The architecture uses SUMO (Simulation of Urban Mobility) to generate high-fidelity traffic emission datasets. This data is then processed through a multi-stage ML pipeline that identifies spatiotemporal pollution patterns with 85% predictive accuracy.
By applying gradient-boosted trees (XGBoost) and deep neural networks, we developed an optimization algorithm that dynamically redirects traffic flow. Simulations showed a theoretical 75% reduction in CO2 and NOx concentrations at critical intersections.
Tech Stack
Data Pipeline & Model Training
Simulation & Generation
Utilizing SUMO to simulate 24-hour traffic cycles in Flushing, NY, generating granular emission data for CO, CO2, and NOx.
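As a rough sketch of how such a 24-hour run might be launched, the snippet below builds a SUMO command line that writes per-vehicle emissions to an XML file. The file names (flushing.sumocfg, emissions.xml) and the helper names are hypothetical placeholders, not the project's actual files; the flags -c, --begin, --end, and --emission-output are standard SUMO options.

```python
import shutil
import subprocess


def build_sumo_command(config_path, emission_path, begin_s=0, end_s=86400):
    """Return the argument list for one simulated day (86400 seconds)."""
    return [
        "sumo",
        "-c", config_path,                    # scenario configuration (network, routes)
        "--begin", str(begin_s),              # simulation start time in seconds
        "--end", str(end_s),                  # simulation end time in seconds
        "--emission-output", emission_path,   # per-vehicle emission values per timestep
    ]


def run_simulation(config_path, emission_path):
    """Run SUMO if it is installed; raise a clear error otherwise."""
    if shutil.which("sumo") is None:
        raise RuntimeError("sumo executable not found on PATH")
    subprocess.run(build_sumo_command(config_path, emission_path), check=True)


# Hypothetical file names, for illustration only.
cmd = build_sumo_command("flushing.sumocfg", "emissions.xml")
print(" ".join(cmd))
```

Calling run_simulation with the same arguments would execute the run, assuming a valid scenario configuration exists at that path.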
Feature Engineering
Preprocessing raw XML outputs into structured dataframes. Engineering temporal features (hour, day) and spatial clusters for hotspot analysis.
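The preprocessing step could look roughly like the sketch below, which streams a SUMO emission file, flattens each vehicle record into a row, and derives hour and day from the simulation time. The sample XML is a hand-written fragment in SUMO's emission-output layout, and the fixed-size grid cell is only a simple stand-in for the spatial clustering described above; the cell size and function name are assumptions.

```python
import io
import xml.etree.ElementTree as ET

# Hand-written sample in the layout of SUMO's emission output (illustrative values).
SAMPLE_XML = """<emission-export>
  <timestep time="3600.00">
    <vehicle id="veh0" CO2="2624.72" CO="164.78" NOx="1.20" lane="e1_0" x="5.10" y="-1.60"/>
    <vehicle id="veh1" CO2="1800.00" CO="90.00" NOx="0.80" lane="e2_0" x="105.00" y="40.00"/>
  </timestep>
  <timestep time="90000.00">
    <vehicle id="veh0" CO2="2000.00" CO="100.00" NOx="1.00" lane="e1_0" x="20.00" y="-1.60"/>
  </timestep>
</emission-export>"""

POLLUTANTS = ("CO", "CO2", "NOx")


def parse_emissions(source, cell_size=100.0):
    """Flatten emission XML into a list of row dicts with temporal and spatial features."""
    rows = []
    # iterparse streams the file, so large simulation outputs need not fit in memory.
    for _, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "timestep":
            continue
        t = float(elem.get("time"))
        for veh in elem.iter("vehicle"):
            row = {
                "time_s": t,
                "hour": int(t // 3600) % 24,   # hour of day
                "day": int(t // 86400),        # day index since simulation start
                "vehicle_id": veh.get("id"),
                "lane": veh.get("lane"),
                # Assumed stand-in for clustering: bin positions into square grid cells.
                "cell": (int(float(veh.get("x")) // cell_size),
                         int(float(veh.get("y")) // cell_size)),
            }
            for name in POLLUTANTS:
                row[name] = float(veh.get(name))
            rows.append(row)
        elem.clear()  # free the processed timestep
    return rows


rows = parse_emissions(io.BytesIO(SAMPLE_XML.encode("utf-8")))
```

The resulting list of dicts can be passed directly to pandas.DataFrame(rows) to obtain a structured dataframe for hotspot analysis.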
Hyperparameter Tuning
Implementing grid-search and randomized-search cross-validation to optimize model hyperparameters, including the focal-loss settings used to handle imbalanced pollution spikes.
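To make the tuning step concrete, here is a minimal sketch of a binary focal loss together with the two ways of enumerating candidate settings. The parameter names and value ranges in PARAM_GRID are assumptions chosen for illustration, and the cross-validated scoring of each candidate is omitted; in practice a library such as scikit-learn would handle that part.

```python
import itertools
import math
import random


def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss for one sample.

    p: predicted probability of the positive class (a pollution spike)
    y: true label, 1 for a spike and 0 otherwise
    """
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)              # guard against log(0)
    p_t = p if y == 1 else 1.0 - p               # probability given to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha   # class-balancing weight
    # (1 - p_t) ** gamma shrinks the loss of easy, well-classified samples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# Hypothetical search space, for illustration only.
PARAM_GRID = {
    "alpha": [0.25, 0.5, 0.75],
    "gamma": [0.0, 1.0, 2.0],
    "max_depth": [4, 6, 8],
}


def grid_candidates(grid):
    """Grid search: every combination of the listed values."""
    keys = sorted(grid)
    combos = itertools.product(*(grid[k] for k in keys))
    return [dict(zip(keys, values)) for values in combos]


def random_candidates(grid, n, seed=0):
    """Random search: n independent draws from the same value lists."""
    rng = random.Random(seed)
    return [{k: rng.choice(v) for k, v in grid.items()} for _ in range(n)]


all_candidates = grid_candidates(PARAM_GRID)
sampled_candidates = random_candidates(PARAM_GRID, n=5)
```

With gamma set to 0 the loss reduces to class-weighted cross-entropy, so including that value in the grid lets the search compare focal loss against the plain baseline.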
