UHI-Explorer

Interactive satellite ML classifier for urban heat islands.

Three cities, three different ML models, each chosen for the specific physics of its UHI pattern. Rio's heat is driven by thermal radiance and building density. Santiago's is controlled by elevation and thermal inversions. Freetown has no labeled data at all, so the classifier transfers knowledge from the other two using an ensemble with specialist routing.

Best F1: 0.96
Features: 39
Grid Points: 64,255
Cities: 3


Classification performance

Rio de Janeiro: XGBoost, 28,488 points, F1 Score 0.96
Santiago: Random Forest, 21,662 points, F1 Score 0.69
Freetown: RF + XGBoost Ensemble, 14,105 points, F1 Score 0.58 (against pseudo-labels, not ground truth)

From pixels to predictions

Satellite Extraction

UHI-Pipe pulls Sentinel-2 (optical), Landsat-8 (thermal), Copernicus DEM (elevation), and building footprints from Microsoft Planetary Computer. 100m grid, 5×5 pixel median.
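A minimal sketch of the 5×5 pixel median step, assuming each band is already loaded as a 2-D NumPy array and that missing pixels are stored as NaN. The function name `window_median` and the edge-clipping behaviour are illustrative assumptions, not UHI-Pipe's actual code.

```python
import numpy as np

def window_median(band, row, col, size=5):
    """Median of a size x size pixel window centred on (row, col).

    The window is clipped at the raster edge, and NaN pixels
    (assumed here to mark cloud or no-data) are ignored.
    """
    half = size // 2
    r0, r1 = max(row - half, 0), min(row + half + 1, band.shape[0])
    c0, c1 = max(col - half, 0), min(col + half + 1, band.shape[1])
    return float(np.nanmedian(band[r0:r1, c0:c1]))

# Toy 7x7 raster with one extreme outlier at the centre pixel.
band = np.arange(49, dtype=float).reshape(7, 7)
band[3, 3] = 1e6
centre_value = window_median(band, 3, 3)  # full 5x5 window
corner_value = window_median(band, 0, 0)  # window clipped to 3x3
```

A median is a reasonable choice for this kind of extraction because a single contaminated pixel in the window cannot dominate the result the way it would with a mean.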

Rio de Janeiro

XGBoost

Thermal radiance dominates Rio's UHI pattern. The city's dense urban core traps heat in concrete and asphalt, creating a clear spectral signature that XGBoost separates with high confidence. Building morphology features (compactness, sky view factor) add the physical mechanism that spectral bands alone cannot capture.

Model: XGBoost
Data: 28,488 grid points at 100m resolution
F1 Score: 0.96

n_estimators: 300
max_depth: (value not shown)
learning_rate: 0.1
eval_metric: mlogloss

Boosting applies iterative error correction to Medium-class boundary pixels, exactly where Random Forest plateaus, pushing F1 from 0.947 to 0.959.

Top features

LST: 22.0%
LST × NDVI: 18.0%
LST × NDBI: 14.0%
NDMI: 10.0%
Building Compactness: 6.0%
NDBI: 5.0%
Elevation: 4.0%
NDVI: 4.0%
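The "×" features in the list above read as interaction terms. A minimal sketch, assuming they are plain element-wise products of two base features; the dictionary keys and the helper name `add_interactions` are hypothetical.

```python
import numpy as np

def add_interactions(features):
    """Return a copy of the feature dict with product interaction terms added.

    Assumes each interaction is a simple element-wise product of two columns.
    """
    out = dict(features)
    out["LST_x_NDVI"] = features["LST"] * features["NDVI"]
    out["LST_x_NDBI"] = features["LST"] * features["NDBI"]
    return out

# Two toy grid points: a hot built-up cell and a cooler vegetated cell.
base = {
    "LST": np.array([40.0, 30.0]),
    "NDVI": np.array([0.1, 0.6]),
    "NDBI": np.array([0.3, -0.2]),
}
augmented = add_interactions(base)
```

A product term lets a tree split directly on "hot and bare" versus "hot but vegetated", which would otherwise take two successive splits.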

Ablation study

Spectral only (5 bands): 93.4%
+ Spectral indices: 93.6% (+0.2%)
+ All features (RF): 94.7% (+1.1%)
+ All features (XGBoost): 95.9% (+1.2%)

Confusion matrix (Actual vs. Predicted; classes Low, Medium, High). Cell values recovered, in reading order: 4,521, 102, 3,845, 4,298.

Figure: Rio de Janeiro Classifications. 28,488 grid points, 91 mismatches. Legend: High, Medium, Low.

Only 91 mismatches in 28,488 grid points (99.7% agreement). Most errors cluster at coastal edges where water adjacency creates mixed pixels, and at forest-urban transitions where the 500m extraction footprint blends classes.

Santiago

Random Forest

Elevation dominates Santiago's UHI pattern, not thermal radiance. The city sits in a basin surrounded by the Andes, creating thermal inversion zones where a layer of warmer air aloft traps cooler air near the basin floor. A 6-8°C temperature gradient can occur over short distances, making the Medium class genuinely ambiguous in spectral space.

Model: Random Forest
Data: 21,662 grid points at 100m resolution
F1 Score: 0.69

n_estimators: 500
max_depth: (value not shown)
min_samples_split: (value not shown)
ccp_alpha: 0.001
class_weight: balanced

Medium class is 49.9% of data. Unconstrained boosters overfit to the majority class. Constrained RF with cost-complexity pruning keeps per-class F1 balanced at 0.68-0.70.
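A minimal sketch of a constrained Random Forest with the settings listed for Santiago, using scikit-learn on synthetic data. The `max_depth` and `min_samples_split` values are not given on this page, so they are left at library defaults here; the data and class mix are made up for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.utils.class_weight import compute_class_weight

rng = np.random.default_rng(0)

# Synthetic stand-in data: 0=Low, 1=Medium, 2=High, with Medium about half.
y = rng.choice([0, 1, 2], size=600, p=[0.25, 0.5, 0.25])
X = rng.normal(size=(600, 4)) + 0.8 * y[:, None]

rf = RandomForestClassifier(
    n_estimators=500,
    ccp_alpha=0.001,          # cost-complexity pruning of each tree
    class_weight="balanced",  # reweight classes inversely to frequency
    random_state=0,
)
rf.fit(X, y)

# The weights that "balanced" implies: n_samples / (n_classes * class_count).
weights = compute_class_weight("balanced", classes=np.array([0, 1, 2]), y=y)
```

With Medium at roughly half the data, the balanced weights down-weight it relative to Low and High, which is one way to keep the majority class from absorbing ambiguous pixels.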

Top features

Elevation: 20.4%
Elev × LST: 17.4%
LST: 9.0%
NDVI: 7.0%
LST × NDBI: 6.0%
NDBI: 5.0%
Albedo: 4.0%
NDMI: 4.0%

Per-class F1 score

High: 0.68
Medium: 0.70
Low: 0.69

All three classes perform within 2 percentage points of each other. The balanced performance means the model is not sacrificing minority classes for overall accuracy.
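For reference, per-class F1 can be computed directly from a confusion matrix. The sketch below assumes rows are actual classes and columns are predicted classes; the matrix is a made-up example, not the Santiago results.

```python
import numpy as np

def per_class_f1(cm):
    """Per-class F1 from a confusion matrix (rows: actual, columns: predicted).

    Assumes every class is both present and predicted at least once,
    so no division by zero occurs.
    """
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    precision = tp / cm.sum(axis=0)  # column sums = predicted counts
    recall = tp / cm.sum(axis=1)     # row sums = actual counts
    return 2 * precision * recall / (precision + recall)

# Hypothetical 3-class matrix in Low, Medium, High order.
example_cm = [
    [8, 2, 0],
    [2, 6, 2],
    [0, 2, 8],
]
f1 = per_class_f1(example_cm)
macro_f1 = float(f1.mean())
```

Reporting the per-class values alongside the macro average makes it visible when a headline score hides one weak class.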

Confusion matrix (Actual vs. Predicted; classes Low, Medium, High). Cell values recovered, in reading order: 1,382, 206, 199, 3,573, 323, 292, 1,584.

Figure: Santiago Classifications. 21,662 grid points. Legend: High, Medium, Low.

Per-class F1 is remarkably balanced (High: 0.68, Medium: 0.70, Low: 0.69). No single class dramatically underperforms. The F1 ceiling of about 0.69 is set by landscape physics, not algorithm choice: thermal inversions create zones where ground truth itself is uncertain.

Freetown

Transfer Learning

Freetown has no labeled UHI data. The pipeline trains on Rio and Santiago, then transfers predictions to Freetown using only the 5 spectral bands that generalize across climates. All 13 spectral indices degraded transfer because their physical meaning is climate-specific. Building footprint data is unavailable for Freetown, eliminating 15 features.

Why single-source transfer fails

Leave-One-City-Out cross-validation shows that training on one city and predicting another produces unusable results. Climate, vegetation type, and urban morphology are too different between cities.

Chile → Brazil: F1 0.467 (gap: -27.5%)
Brazil → Chile: F1 0.360 (gap: -54.1%)
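Leave-One-City-Out evaluation can be sketched with scikit-learn's `LeaveOneGroupOut`, using the city name as the group label. Everything below is illustrative: the synthetic data, the Random Forest settings, and the macro-averaged F1 are assumptions, not the project's evaluation code.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import LeaveOneGroupOut

def loco_scores(X, y, groups):
    """Train on all cities but one, score on the held-out city."""
    scores = {}
    for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups):
        model = RandomForestClassifier(n_estimators=100, random_state=0)
        model.fit(X[train_idx], y[train_idx])
        held_out = str(groups[test_idx][0])
        pred = model.predict(X[test_idx])
        scores[held_out] = f1_score(
            y[test_idx], pred, average="macro", zero_division=0
        )
    return scores

# Synthetic two-city data: the same class rule applies within each city,
# but one city's features are shifted, mimicking a climate difference.
rng = np.random.default_rng(0)
city = np.repeat(np.array(["rio", "santiago"]), 200)
shift = np.where(city == "santiago", 2.0, 0.0)
X = rng.normal(size=(400, 5))
X[:, 0] += shift
y = np.digitize(X[:, 0] - shift, [-0.5, 0.5])  # classes 0, 1, 2

scores = loco_scores(X, y, city)
```

Grouping by city guarantees that no grid point from the held-out city leaks into training, which a random split would not.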

Feature reduction

Total features: 39 (13 indices + 15 building + 6 interaction + 5 bands)
Transferable bands: lwir11, LST, SWIR2_NIR, blue, nir08
Components retained: PCA(3), 96% variance explained

The indices fail to transfer because the same index value means different things in each climate: Rio's NDVI measures tropical broadleaf vegetation, while Santiago's measures Mediterranean scrub. Only raw spectral bands generalize.
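A minimal sketch of reducing the five transferable bands to three principal components with scikit-learn. The synthetic data and the standardization step are assumptions; the page does not say whether the bands were scaled before PCA.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

bands = ["lwir11", "LST", "SWIR2_NIR", "blue", "nir08"]

# Synthetic band matrix: 3 latent factors mixed into 5 correlated columns.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 3))
mixing = rng.normal(size=(3, len(bands)))
X = latent @ mixing + 0.1 * rng.normal(size=(500, len(bands)))

# Fit the scaler and PCA on source data; the same fitted objects would
# then be applied unchanged to the target city.
scaler = StandardScaler().fit(X)
pca = PCA(n_components=3).fit(scaler.transform(X))

Z = pca.transform(scaler.transform(X))
explained = float(pca.explained_variance_ratio_.sum())
```

Fitting on the source cities and only transforming the target keeps the projection from adapting to the unlabeled city, so the downstream classifiers see features in the space they were trained on.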

Specialist routing

Instead of one model for all classes, the ensemble routes each UHI class to the specialist model that handles it best. The routing is determined by which source city's training data best represents each class.

High UHI: Brazil XGBoost. Tropical thermal core matches informal settlements with dense impervious surfaces.

Medium UHI: Chile RF. Peri-urban transitions and mixed land cover are better captured by the constrained model.

Low UHI: Combined RF+XGB. Both models agree on cool vegetated surfaces. Ensemble averages for stability.
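One plausible reading of the routing above is a per-class score table: the High score comes from the Brazil-trained XGBoost model, the Medium score from the Chile-trained RF, and the Low score from the average of both. The sketch below assumes both models output class probabilities in Low, Medium, High column order; the function `route_scores` and this combination rule are assumptions, not the project's confirmed implementation.

```python
import numpy as np

LOW, MEDIUM, HIGH = 0, 1, 2

def route_scores(p_xgb, p_rf):
    """Build per-class scores by taking each class from its specialist.

    p_xgb, p_rf: arrays of shape (n_points, 3) holding class probabilities.
    """
    p_xgb = np.asarray(p_xgb, dtype=float)
    p_rf = np.asarray(p_rf, dtype=float)
    scores = np.empty_like(p_xgb)
    scores[:, HIGH] = p_xgb[:, HIGH]                      # XGBoost specialist
    scores[:, MEDIUM] = p_rf[:, MEDIUM]                   # RF specialist
    scores[:, LOW] = 0.5 * (p_xgb[:, LOW] + p_rf[:, LOW])  # averaged
    return scores

# Three toy grid points with made-up probabilities.
p_xgb = [[0.1, 0.2, 0.7], [0.6, 0.3, 0.1], [0.3, 0.3, 0.4]]
p_rf = [[0.2, 0.5, 0.3], [0.5, 0.4, 0.1], [0.2, 0.7, 0.1]]

scores = route_scores(p_xgb, p_rf)
labels = scores.argmax(axis=1)
```

The routed scores no longer sum to one per row, so they are best treated as scores rather than probabilities until a calibration step is applied.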

Model: RF + XGBoost Ensemble
Data: 14,105 grid points at 100m resolution
F1 Score: 0.58

RF n_estimators: 300
RF max_depth: (value not shown)
XGB n_estimators: 300
XGB max_depth: (value not shown)
XGB learning_rate: 0.1

No ground-truth labels available. Single-source transfer fails badly (Chile→Brazil F1=0.467, Brazil→Chile F1=0.360). An ensemble routes each UHI class to the specialist model that handles it best.

Top features

lwir11: 31.0%
LST: 24.0%
SWIR2_NIR: 19.0%
blue: 14.0%
nir08: 12.0%

Important caveat

The F1 score of 0.58 is validated against KMeans clustering pseudo-labels, not ground truth. There are no labeled UHI zones for Freetown. The spatial pattern is physically plausible (High in coastal lowland informal settlements, Low in forested uplands), but true classification accuracy is unknown. Nelder-Mead probability calibration adjusts score offsets to match approximate known prevalence before assigning hard labels.
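The offset calibration described above can be sketched with SciPy's Nelder-Mead optimizer. The squared-error loss, the fixed zero offset for the first class, the initial simplex size, and the target prevalence values are all assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import minimize

def prevalence_error(scores, offsets, target):
    """Squared gap between predicted class shares and target shares."""
    labels = np.argmax(scores + offsets, axis=1)
    shares = np.bincount(labels, minlength=scores.shape[1]) / len(labels)
    return float(np.sum((shares - target) ** 2))

def calibrate_offsets(scores, target, step=0.2):
    """Search for per-class score offsets that match the target prevalence.

    The first class offset is fixed at 0 because only differences between
    offsets affect the argmax.
    """
    k = scores.shape[1]

    def loss(free):
        return prevalence_error(scores, np.concatenate([[0.0], free]), target)

    x0 = np.zeros(k - 1)
    # The loss is piecewise constant, so start from a wide simplex.
    simplex = np.vstack([x0, x0 + step * np.eye(k - 1)])
    result = minimize(
        loss, x0, method="Nelder-Mead",
        options={"initial_simplex": simplex, "xatol": 1e-3, "fatol": 1e-6},
    )
    return np.concatenate([[0.0], result.x])

# Toy scores skewed toward the first class, and a hypothetical target mix.
rng = np.random.default_rng(0)
scores = rng.dirichlet([2.0, 1.0, 1.0], size=1000)
target = np.array([0.3, 0.5, 0.2])

offsets = calibrate_offsets(scores, target)
before = prevalence_error(scores, np.zeros(3), target)
after = prevalence_error(scores, offsets, target)
hard_labels = np.argmax(scores + offsets, axis=1)
```

A derivative-free method fits this problem because the loss only changes when a point flips class, so gradients are zero almost everywhere.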

Figure: Freetown UHI Predictions. Ensemble output, no ground truth available. Legend: High, Medium, Low.

F1 of 0.58 is measured against KMeans clustering pseudo-labels, not ground truth. The spatial pattern is physically plausible: High UHI in coastal lowland informal settlements, Low in forested uplands. But true accuracy is unknown.

This project was a collaborative effort — built as part of a team research initiative combining remote sensing, machine learning, and urban climate analysis.

Technologies

Satellite ML, Geospatial, Python, Environmental

UHI-Pipe
