UHI-Pipe
A satellite data extraction pipeline that works for any location on Earth.
You give it a bounding box, a time window, and a set of lat/lon coordinates. It pulls imagery from four satellite and geospatial sources, cleans and composites the data, computes spectral indices and land surface temperature, and hands you back a single ML-ready DataFrame. No API keys. No manual downloads. Works anywhere Sentinel-2 and Landsat-8 have coverage.
19 spectral indices · 4 data sources · ~44 output columns
How it works
Data Sources
Sentinel-2, Landsat-8, Copernicus DEM, Building Footprints via Microsoft Planetary Computer
Point Extraction
Download only pixels at your coordinates, apply cloud and shadow masking
Feature Engineering
Compute 19 spectral indices + LST with emissivity correction
Smart Caching
Parquet-based cache. Static sources extracted once, not per resolution
ML-Ready Output
One DataFrame, one row per point, ~44 columns, ready for classification
The pipeline connects to Microsoft Planetary Computer, a free STAC catalog that hosts petabytes of satellite data. It downloads only the pixels at your specific coordinates, applies cloud and quality masking, composites across your time window, and caches everything so re-runs are instant.
Sentinel-2 L2A
10–60m resolution
Sentinel-2 is the backbone of the pipeline. It captures 11 spectral bands ranging from visible light to shortwave infrared, giving us enough spectral resolution to distinguish vegetation from concrete, water from bare soil, and everything in between. The L2A product is already atmospherically corrected, so the reflectance values represent surface properties rather than atmospheric noise.
Search STAC catalog for scenes within bounding box and time window (max 30% cloud cover)
Apply SCL (Scene Classification Layer) mask: remove clouds (class 8, 9), shadows (3), snow (11), saturated (1), nodata (0)
Resample all bands to target resolution using nearest-neighbor interpolation
Compute median composite across all valid scenes in the time window
Extract pixel values at each sample point using 5×5 neighborhood median (not just nearest pixel)
Set raw DN = 0 to NaN before any computation
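The masking and compositing steps above can be sketched with plain NumPy. This is a minimal illustration, not the pipeline's actual code: the function name `masked_median_composite` and the `(time, y, x)` array layout are assumptions, and the SCL class codes are the ones listed in the steps.

```python
import warnings

import numpy as np

# SCL classes to discard, as listed in the steps:
# nodata (0), saturated (1), shadow (3), clouds (8, 9), snow (11)
SCL_INVALID = (0, 1, 3, 8, 9, 11)


def masked_median_composite(band, scl):
    """Median composite over time, ignoring masked and zero-DN pixels.

    band: array of shape (time, y, x) with raw DN values for one band
    scl:  array of shape (time, y, x) with Scene Classification Layer codes
    Returns (composite, valid_count), each of shape (y, x).
    """
    data = band.astype("float64")  # astype copies, so the input is untouched
    invalid = np.isin(scl, SCL_INVALID) | (data == 0)  # DN = 0 is treated as nodata
    data[invalid] = np.nan
    valid_count = np.sum(~invalid, axis=0)
    with warnings.catch_warnings():
        # Pixels with no valid scene produce an "All-NaN slice" warning; they stay NaN.
        warnings.simplefilter("ignore", RuntimeWarning)
        composite = np.nanmedian(data, axis=0)
    return composite, valid_count
```

Returning the per-pixel count of valid scenes alongside the composite makes it easy to spot locations where the median rests on only one or two observations.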
19 features extracted
The 5×5 neighborhood median is important. Rather than extracting just the nearest pixel to each coordinate, the pipeline takes the median value in a 5×5 grid around the point. This reduces noise from GPS coordinate error and sub-pixel heterogeneity, at the cost of smoothing out features smaller than 5 pixels across.
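To make the neighborhood idea concrete, here is a small sketch of a windowed median at a single pixel location. The function name and the edge-clipping behaviour are assumptions for illustration; the pipeline may handle borders differently.

```python
import numpy as np


def neighborhood_median(raster, row, col, size=5):
    """Median of the size x size window centred on (row, col), ignoring NaN.

    The window is clipped at the raster edge, so border points use fewer pixels.
    Returns NaN when every pixel in the window is NaN.
    """
    half = size // 2
    r0, r1 = max(row - half, 0), min(row + half + 1, raster.shape[0])
    c0, c1 = max(col - half, 0), min(col + half + 1, raster.shape[1])
    window = raster[r0:r1, c0:c1]
    if np.all(np.isnan(window)):
        return float("nan")
    return float(np.nanmedian(window))
```

A single extreme pixel at the sample location barely moves the result, which is the robustness the text describes. The flip side is that a genuine hot spot one or two pixels wide gets averaged away.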
Spectral index library
The 19 features above come from these formulas. Each index is a mathematical combination of Sentinel-2 bands designed to isolate a specific surface property. Vegetation indices use the contrast between red absorption and near-infrared reflection. Built-up indices exploit how impervious surfaces reflect shortwave infrared. Water indices flip the vegetation logic.
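As an illustration of the three families described above, the sketch below computes one index from each using the standard normalized-difference form. The band assignments (B03 green, B04 red, B08 NIR, B11 SWIR) follow Sentinel-2 conventions; which exact variants the pipeline uses for its 19 indices is not shown here, so treat the selection and the function names as assumptions.

```python
import numpy as np


def normalized_difference(a, b):
    """(a - b) / (a + b), returning NaN where the denominator is zero."""
    a = np.asarray(a, dtype="float64")
    b = np.asarray(b, dtype="float64")
    denom = a + b
    safe = np.where(denom == 0, np.nan, denom)  # avoid division by zero
    return (a - b) / safe


def basic_indices(b03, b04, b08, b11):
    """One index per family: vegetation, built-up, water."""
    return {
        "NDVI": normalized_difference(b08, b04),  # NIR vs red: vegetation
        "NDBI": normalized_difference(b11, b08),  # SWIR vs NIR: built-up
        "NDWI": normalized_difference(b03, b08),  # green vs NIR: water
    }
```

Note how NDWI puts NIR in the subtracted position where NDVI adds it: that is the "flipped" logic, since water absorbs NIR strongly while vegetation reflects it.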
Landsat-8 Collection 2
30m visible, 100m thermal
Landsat-8 matters here for one reason: thermal infrared. Sentinel-2 does not have a thermal band, so it cannot measure surface temperature directly. Landsat's Band 10 captures longwave infrared radiation emitted by the ground, which is how we derive land surface temperature. The visible and NIR bands also give us a second, independent NDVI measurement that feeds into the emissivity correction.
Search for Collection 2 Level 2 scenes (max 50% cloud cover)
Apply QA_PIXEL bitmask: remove cloud, cloud shadow, snow, fill
Visible/NIR bands are surface reflectance (already atmospherically corrected)
Thermal band (lwir11) is brightness temperature in Kelvin, converted to °C
Median composite across time window
Emissivity derived from Landsat NDVI using Sobrino et al. (2004) thresholds
LST computed via mono-window correction using brightness temperature and emissivity
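The QA_PIXEL step is a bitwise test. The sketch below uses the Landsat Collection 2 bit positions for fill (0), cloud (3), cloud shadow (4), and snow (5); the function name and the choice to leave dilated cloud and cirrus unmasked are assumptions that mirror the four flags named in the step list.

```python
import numpy as np

# Landsat Collection 2 QA_PIXEL bit positions for the flags named above
QA_BITS = {"fill": 0, "cloud": 3, "cloud_shadow": 4, "snow": 5}


def qa_pixel_valid(qa, flags=("fill", "cloud", "cloud_shadow", "snow")):
    """Boolean array: True where none of the selected QA_PIXEL flags is set."""
    qa = np.asarray(qa, dtype="uint16")
    reject = 0
    for name in flags:
        reject |= 1 << QA_BITS[name]  # build one combined rejection mask
    return (qa & reject) == 0
```

Pixels where the result is False would be set to NaN before the median composite, the same way the SCL mask is used on the Sentinel-2 side.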
3 features extracted
Mono-Window Correction
LST = BT / (1 + (λ · BT / ρ) · ln ε)
Where BT is brightness temperature from Landsat Band 10 in Kelvin, λ = 10.895 μm, ρ = 14388 μm·K, and ε is emissivity derived from NDVI.
Without emissivity correction, the thermal band gives you brightness temperature, what the sensor sees at the top of the atmosphere. The mono-window method adjusts for how efficiently different surfaces emit radiation. Bare soil (ε ≈ 0.97) emits differently than full vegetation (ε ≈ 0.99). The pipeline derives emissivity from Landsat NDVI using Sobrino et al. (2004) thresholds, then corrects the brightness temperature to get actual ground temperature in °C.
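Here is a minimal sketch of the correction, using the constants from the text and the form LST = BT / (1 + (λ · BT / ρ) · ln ε). The emissivity function is a simplification: the NDVI thresholds (0.2 and 0.5) and the way the mixed class blends between the soil and vegetation values are assumptions, not necessarily the exact Sobrino et al. (2004) coefficients the pipeline applies.

```python
import numpy as np

WAVELENGTH_UM = 10.895  # λ, from the text
RHO_UM_K = 14388.0      # ρ = h·c/k_B in μm·K, from the text


def emissivity_from_ndvi(ndvi, soil=0.97, veg=0.99, ndvi_soil=0.2, ndvi_veg=0.5):
    """Three-regime NDVI emissivity (threshold values are assumed).

    Below ndvi_soil -> soil value, above ndvi_veg -> vegetation value,
    in between -> blend weighted by the squared vegetation fraction.
    """
    ndvi = np.asarray(ndvi, dtype="float64")
    pv = np.clip((ndvi - ndvi_soil) / (ndvi_veg - ndvi_soil), 0.0, 1.0) ** 2
    return soil + (veg - soil) * pv


def land_surface_temperature_c(bt_kelvin, emissivity):
    """Apply the emissivity correction. Input BT in Kelvin, output LST in °C."""
    bt = np.asarray(bt_kelvin, dtype="float64")
    eps = np.asarray(emissivity, dtype="float64")
    lst_kelvin = bt / (1.0 + (WAVELENGTH_UM * bt / RHO_UM_K) * np.log(eps))
    return lst_kelvin - 273.15
```

For a brightness temperature of 300 K, a bare-soil emissivity of 0.97 raises the result by roughly 2 °C compared with treating the surface as a perfect emitter (ε = 1). The formula needs BT in Kelvin, so any conversion to °C has to happen after the correction, not before.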
Copernicus GLO-30 DEM
30m static
Elevation matters because higher ground is cooler. Hilltop neighborhoods experience different heat dynamics than low-lying river valleys, even when the land cover is identical. The Copernicus GLO-30 DEM provides consistent global elevation data at 30-meter resolution. Unlike the satellite imagery sources, this is a static dataset. No date parameter, no cloud masking, no compositing. The pipeline downloads the tiles once, mosaics them, and extracts the elevation at each point.
Query STAC catalog for GLO-30 DEM tiles covering the bounding box
Mosaic tiles into a single elevation raster
Reproject to WGS 84 and resample to target resolution
Extract elevation at each sample point (nearest pixel, no neighborhood median needed)
Cache result per city. DEM is static, so the cache key ignores date and resolution
1 feature extracted
Because the DEM is static, it gets special treatment in the caching layer. During a resolution sweep (running 7 resolutions across 3 cities), the pipeline extracts DEM once per city instead of 21 times. The cache key ignores resolution and date parameters for this source.
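One way to get this behaviour is to leave date and resolution out of the key for static sources. The sketch below is hypothetical: the function name, the source labels, and the hashing scheme are illustrative assumptions, not the pipeline's actual cache layout.

```python
import hashlib
import json

# Assumed label for the source whose output does not depend on date or resolution
STATIC_SOURCES = {"dem"}


def cache_key(source, city, time_window=None, resolution=None):
    """Stable cache key; static sources ignore time window and resolution."""
    parts = {"source": source, "city": city}
    if source not in STATIC_SOURCES:
        parts["time_window"] = time_window
        parts["resolution"] = resolution
    payload = json.dumps(parts, sort_keys=True)  # deterministic serialization
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]
```

With keys built this way, a sweep over 7 resolutions and 3 cities produces 21 distinct Sentinel-2 keys but only 3 DEM keys, so the DEM extraction runs once per city and every later request hits the cache.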
Building Footprints
Vector, 100m buffer
Satellite imagery tells you what the surface looks like from above, but it cannot tell you how tall the buildings are or how densely packed they are. Building footprint data fills that gap. The pipeline accepts either a pre-computed CSV with density values or a raw shapefile of building polygons. When given a shapefile, it computes building density within a configurable buffer (default 100m) around each sample point.
Load building polygons from shapefile or pre-computed CSV
Build spatial index (R-tree) over all footprints for fast lookup
For each sample point, buffer by 100m and count intersecting footprints
Compute building density as footprint count per buffer area
Merge density values into the main DataFrame, filling NaN where coverage is missing
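The density calculation can be approximated with the standard library alone. This sketch deliberately swaps two things from the steps above: it uses footprint centroids within a radius instead of polygon intersection, and a brute-force loop instead of an R-tree. Function names are hypothetical. It is meant to show the arithmetic, not to replace a proper spatial join.

```python
import math

EARTH_RADIUS_M = 6_371_000.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def building_density(point, centroids, buffer_m=100.0):
    """Footprint count per square metre of buffer around a (lat, lon) point.

    centroids: list of (lat, lon) footprint centroids.
    Returns NaN when no footprint data is supplied at all.
    """
    if not centroids:
        return float("nan")
    lat, lon = point
    count = sum(
        1 for clat, clon in centroids
        if haversine_m(lat, lon, clat, clon) <= buffer_m
    )
    return count / (math.pi * buffer_m ** 2)
```

Returning NaN for missing data, and 0.0 for a covered area with no buildings nearby, keeps "no coverage" distinguishable from "no buildings" in the merged DataFrame.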
Features extracted
Building data is the most uneven source in terms of global availability. 3D-GloBFP has good coverage for major cities but gaps elsewhere. When building data is unavailable, the pipeline runs without it. The corresponding columns return NaN, and the rest of the DataFrame is unaffected. This makes the building source optional rather than a hard dependency.
Usage
All four sources converge into one function call, but each can also run independently. The example below runs the Sentinel-2 source on its own.
    from uhi_pipe import extract_sentinel
    from uhi_pipe.features import sentinel_features

    # Extract raw bands at your coordinates
    raw = extract_sentinel(
        points=my_points,
        bbox=(-33.88, -71.05, -33.23, -70.04),
        time_window="2024-01-15/2024-02-15",
        resolution=100,
        max_cloud=30,
    )
    # → Latitude, Longitude, B01-B12 (raw DN values)

    # Compute all 19 spectral indices
    features = sentinel_features(raw)
    # → Adds NDVI, EVI, SAVI, NDBI, Albedo, ... (19 columns)

Pre-configured locations
Three cities ship as built-in configurations for quick testing. These are the cities from the UHI-Explorer classification project, but the pipeline works with any bounding box and coordinates.
Rio de Janeiro
Santiago
Freetown
Resolution sweep
The pipeline supports target resolutions from 10m to 1000m, but the fused dataset hits a floor at 30m: Landsat thermal is natively 100m (resampled to 30m), and the DEM is 30m. Sentinel-2 bands go down to 10m, but combining them with coarser sources introduces spatial mismatch at fine scales.
Assumptions and limitations
Minimum resolution
30m, limited by Landsat thermal band and DEM. Sentinel-2 bands go down to 10m, but fusing with Landsat at finer scales introduces spatial mismatch.
Temporal compositing
Median composite assumes surface conditions are stable across the time window. Works well for 1-2 month windows, breaks down for longer periods or rapid land-use change.
5×5 neighborhood median
Extracts the median pixel value in a 5×5 grid around each point, not just the nearest pixel. Reduces noise from GPS coordinate error and sub-pixel heterogeneity, but smooths out fine-grained features.
Cloud masking coverage
Persistently cloudy regions (tropics in wet season) may have few or no valid pixels after masking. The pipeline reports valid pixel counts so you know when data is thin.
Building data availability
3D-GloBFP coverage is incomplete globally. For cities without building footprints, the pipeline runs without the buildings source and returns NaN for building features.
Emissivity simplification
Mono-window LST uses NDVI-derived emissivity with three thresholds (bare soil, mixed, full vegetation). Real emissivity varies more than this, especially over water and rock.