UHI-Explorer
Interactive satellite ML classifier for urban heat islands.
Three cities, three different ML models, each chosen for the specific physics of its UHI pattern. Rio's heat is driven by thermal radiance and building density. Santiago's is controlled by elevation and thermal inversions. Freetown has no labeled data at all, so the classifier transfers knowledge from the other two using an ensemble with specialist routing.
Best F1: 0.96
Features: 39
Grid points: 64,255
Cities: 3
Classification performance
From pixels to predictions
Satellite Extraction
UHI-Pipe pulls Sentinel-2 (optical), Landsat-8 (thermal), Copernicus DEM (elevation), and building footprints from Microsoft Planetary Computer. 100m grid, 5×5 pixel median.
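The 5×5 pixel median can be sketched as below. This is a minimal illustration, not UHI-Pipe's actual code: the function name `window_median`, the list-of-lists raster layout, and the use of `None` for masked pixels are all assumptions made for the example.

```python
from statistics import median

def window_median(raster, row, col, size=5):
    """Median of a size x size window centred on (row, col).

    The window is clipped at the raster edges and masked pixels
    (represented here as None, an assumption) are ignored.
    """
    half = size // 2
    values = []
    for r in range(max(0, row - half), min(len(raster), row + half + 1)):
        for c in range(max(0, col - half), min(len(raster[0]), col + half + 1)):
            value = raster[r][c]
            if value is not None:
                values.append(value)
    return median(values) if values else None

# Toy 5x5 reflectance band with a single outlier pixel in the centre.
band = [[0.20] * 5 for _ in range(5)]
band[2][2] = 0.95

centre_sample = window_median(band, 2, 2)   # full 5x5 window
corner_sample = window_median(band, 0, 0)   # clipped to a 3x3 window
```

A median is less sensitive than a mean to a single bright or cloudy pixel, which is why the outlier in the toy band does not move the sampled value.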
Rio de Janeiro
XGBoost
Thermal radiance dominates Rio's UHI pattern. The city's dense urban core traps heat in concrete and asphalt, creating a clear spectral signature that XGBoost separates with high confidence. Building morphology features (compactness, sky view factor) add the physical mechanism that spectral bands alone cannot capture.
XGBoost
28,488 grid points at 100m resolution
F1 score: 0.96
n_estimators: 300
max_depth: 8
learning_rate: 0.1
eval_metric: mlogloss
XGBoost applies iterative error correction to Medium-class boundary pixels. Boosting corrects exactly where Random Forest plateaus, raising F1 from 0.947 to 0.959.
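For reference, the Rio hyperparameters listed above could be collected into a configuration like the following. This is a sketch only: the names `RIO_XGB_PARAMS` and `build_rio_model` are hypothetical, and it assumes a recent xgboost release in which `XGBClassifier` accepts `eval_metric` in its constructor. The import is deferred so the configuration can be inspected without xgboost installed.

```python
# Hyperparameters exactly as listed for the Rio model.
RIO_XGB_PARAMS = {
    "n_estimators": 300,
    "max_depth": 8,
    "learning_rate": 0.1,
    "eval_metric": "mlogloss",
}

def build_rio_model(params=None):
    """Construct an XGBoost classifier from the listed settings.

    Hypothetical helper: xgboost is imported lazily, so this function
    only requires the library when it is actually called.
    """
    from xgboost import XGBClassifier
    return XGBClassifier(**(params or RIO_XGB_PARAMS))
```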
Top features
Ablation study
Confusion matrix
Only 91 mismatches in 28,488 grid points (99.7% agreement). Most errors cluster at coastal edges where water adjacency creates mixed pixels, and at forest-urban transitions where the 500m extraction footprint blends classes.
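The agreement figure follows directly from the two counts quoted above:

```python
mismatches = 91
grid_points = 28_488

# Share of grid points where prediction and label agree.
agreement = 1 - mismatches / grid_points
print(f"{agreement:.1%}")  # prints 99.7%
```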
Santiago
Random Forest
Elevation dominates Santiago's UHI pattern, not thermal radiance. The city sits in a basin surrounded by the Andes, creating thermal inversion zones where a layer of warm air aloft traps cooler air near the surface. A 6-8°C temperature difference can occur over short distances, making the Medium class genuinely ambiguous in spectral space.
Random Forest
21,662 grid points at 100m resolution
F1 score: 0.69
n_estimators: 500
max_depth: 20
min_samples_split: 25
ccp_alpha: 0.001
class_weight: balanced
Medium class is 49.9% of data. Unconstrained boosters overfit to the majority class. Constrained RF with cost-complexity pruning keeps per-class F1 balanced at 0.68-0.70.
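To make the `class_weight: balanced` setting concrete: scikit-learn's "balanced" heuristic weights each class by `n_samples / (n_classes * class_count)`. The sketch below reproduces that formula in plain Python. The Medium share of 49.9% comes from the text; the split between Low and High is a made-up assumption used only to complete the example.

```python
def balanced_weights(class_counts):
    """Per-class weights using the 'balanced' heuristic:
    n_samples / (n_classes * count_for_class)."""
    n_samples = sum(class_counts.values())
    n_classes = len(class_counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in class_counts.items()
    }

# Per 1,000 points: Medium is 49.9% (from the text).
# The Low/High split is hypothetical.
counts = {"Low": 250, "Medium": 499, "High": 251}
weights = balanced_weights(counts)
```

With three classes, a class holding 49.9% of the data gets a weight of about 0.67, while the smaller classes get weights above 1, so errors on High and Low points cost more during training.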
Top features
Per-class F1 score
All three classes perform within 2 percentage points of each other. The balanced performance means the model is not sacrificing minority classes for overall accuracy.
Confusion matrix
Per-class F1 is remarkably balanced (High: 0.68, Medium: 0.70, Low: 0.69). No single class dramatically underperforms. The 69% ceiling is set by landscape physics, not algorithm choice. Thermal inversions create zones where ground truth itself is uncertain.
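For readers who want to reproduce per-class and macro F1 from a confusion matrix, a minimal sketch follows. The matrix values are invented for illustration and are not Santiago's actual results. Rows are assumed to be true classes and columns predicted classes.

```python
def per_class_f1(cm, labels):
    """F1 per class from a square confusion matrix
    (rows = true class, columns = predicted class)."""
    scores = {}
    for i, label in enumerate(labels):
        tp = cm[i][i]
        fn = sum(cm[i]) - tp                    # true class i, predicted otherwise
        fp = sum(row[i] for row in cm) - tp     # predicted i, true class differs
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores

def macro_f1(scores):
    """Unweighted mean of the per-class F1 scores."""
    return sum(scores.values()) / len(scores)

# Illustrative matrix only, with confusion concentrated around Medium.
labels = ["Low", "Medium", "High"]
cm = [
    [70, 25, 5],
    [30, 140, 30],
    [5, 25, 70],
]
scores = per_class_f1(cm, labels)
overall = macro_f1(scores)
```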
Freetown
Transfer Learning
Freetown has no labeled UHI data. The pipeline trains on Rio and Santiago, then transfers predictions to Freetown using only the 5 spectral bands that generalize across climates. All 13 spectral indices degraded transfer because their physical meaning is climate-specific. Building footprint data is unavailable for Freetown, eliminating 15 features.
Why single-source transfer fails
Leave-One-City-Out cross-validation shows that training on one city and predicting another produces unusable results. Climate, vegetation type, and urban morphology are too different between cities.
Chile → Brazil: F1 0.467 (gap: -27.5%)
Brazil → Chile: F1 0.360 (gap: -54.1%)
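The Leave-One-City-Out splits can be sketched as a small generator. The row layout (dicts with a `city` key) and the function name are assumptions for illustration. With two labeled cities, the two folds correspond to the Chile → Brazil and Brazil → Chile experiments above.

```python
def leave_one_city_out(rows, city_key="city"):
    """Yield (held_out_city, train_rows, test_rows), holding out one city per fold."""
    cities = sorted({row[city_key] for row in rows})
    for held_out in cities:
        train = [row for row in rows if row[city_key] != held_out]
        test = [row for row in rows if row[city_key] == held_out]
        yield held_out, train, test

# Toy rows standing in for labeled grid points.
rows = [
    {"city": "Brazil", "label": "High"},
    {"city": "Brazil", "label": "Low"},
    {"city": "Chile", "label": "Medium"},
    {"city": "Chile", "label": "High"},
]
folds = list(leave_one_city_out(rows))
```

Splitting by city rather than by random rows keeps every point from the held-out city out of training, which is what makes the score a measure of cross-city transfer.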
Feature reduction
Total features: 39 (13 indices + 15 building + 6 interaction + 5 bands)
Transferable bands: 5
Components retained: PCA(3), 96% variance explained
All 13 spectral indices degraded transfer because their physical meaning is climate-specific. Rio's NDVI measures tropical broadleaf; Santiago's measures Mediterranean scrub. Building footprint data is unavailable for Freetown, eliminating 15 features. Only raw spectral bands generalize.
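The feature bookkeeping can be expressed as a simple group filter. The group sizes come from the breakdown above; the generated feature names (for example `band_1`) are placeholders, not the project's real column names, and the PCA step is not shown.

```python
# Group sizes as listed in the feature breakdown.
FEATURE_GROUPS = {
    "spectral_index": 13,
    "building": 15,
    "interaction": 6,
    "band": 5,
}

def make_feature_table(groups):
    """Build (name, group) pairs with placeholder names such as 'band_1'."""
    return [
        (f"{group}_{i}", group)
        for group, size in groups.items()
        for i in range(1, size + 1)
    ]

def transferable(features, keep_groups=("band",)):
    """Keep only features whose group is expected to generalize across cities."""
    return [name for name, group in features if group in keep_groups]

all_features = make_feature_table(FEATURE_GROUPS)
transfer_features = transferable(all_features)
```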
Specialist routing
Instead of one model for all classes, the ensemble routes each UHI class to the specialist model that handles it best. The routing is determined by which source city's training data best represents each class.
High UHI
Brazil XGBoost
Tropical thermal core matches informal settlements with dense impervious surfaces.
Medium UHI
Chile RF
Peri-urban transitions and mixed land cover are better captured by the constrained model.
Low UHI
Combined RF+XGB
Both models agree on cool vegetated surfaces. Ensemble averages for stability.
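One way the routing could work is sketched below. This is an assumption about the mechanics, not the project's confirmed implementation: each class score is taken from its assigned specialist (averaged when two are assigned), the scores are renormalized, and the highest one wins. The model keys `brazil_xgb` and `chile_rf` and the probability values are hypothetical.

```python
# Class -> specialist model(s), following the routing described above.
ROUTING = {
    "High": ["brazil_xgb"],
    "Medium": ["chile_rf"],
    "Low": ["chile_rf", "brazil_xgb"],
}

def route(probas, routing=ROUTING):
    """Combine per-model class probabilities for one grid point.

    probas maps model name -> {class: probability}. Returns the
    predicted label and the renormalized routed scores.
    """
    scores = {
        cls: sum(probas[name][cls] for name in specialists) / len(specialists)
        for cls, specialists in routing.items()
    }
    total = sum(scores.values())
    if total == 0:
        return None, scores
    normalized = {cls: score / total for cls, score in scores.items()}
    label = max(normalized, key=normalized.get)
    return label, normalized

# Hypothetical model outputs for a single grid point.
probas = {
    "brazil_xgb": {"High": 0.6, "Medium": 0.3, "Low": 0.1},
    "chile_rf": {"High": 0.2, "Medium": 0.5, "Low": 0.3},
}
label, routed = route(probas)
```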
RF + XGBoost Ensemble
14,105 grid points at 100m resolution
F1 score: 0.58
RF n_estimators: 300
RF max_depth: 10
XGB n_estimators: 300
XGB max_depth: 6
XGB learning_rate: 0.1
No ground-truth labels available. Single-source transfer fails badly (Chile→Brazil F1=0.467, Brazil→Chile F1=0.360). An ensemble routes each UHI class to the specialist model that handles it best.
Top features
Important caveat
The F1 score of 0.58 is validated against KMeans clustering pseudo-labels, not ground truth. There are no labeled UHI zones for Freetown. The spatial pattern is physically plausible (High in coastal lowland informal settlements, Low in forested uplands), but true classification accuracy is unknown. Nelder-Mead probability calibration adjusts score offsets to match approximate known prevalence before assigning hard labels.
This project was a collaborative effort, built as part of a team research initiative combining remote sensing, machine learning, and urban climate analysis.