Methodology

How TerraCube works, all the way down.

Most of the site keeps things plain on purpose. This page does the opposite: it explains, in order, exactly how a street address becomes the numbers in your report — which datasets, what each number means, which years it describes, how we handle uncertainty, and where the method has honest limits. It starts homeowner-readable and gets more technical as you scroll.

The short version

We don't invent numbers. Every value traces to a named, public scientific dataset — the same data climate scientists use — sampled at your point.
“Today” means a 30-year average, not last year's weather. Climate is the long-run pattern; a single hot or cold year doesn't change it.
The future is a range, not one line. We draw on many climate models and show you the middle estimate and how much they disagree, because pretending to a single exact future would be dishonest.
When we don't know, we say so. A point over the ocean or outside a dataset is marked “no data” rather than filled with a guess.
We check our work in public. We measured our baseline against independent observations and published the numbers — they're below, and the test is re-runnable by anyone.

Overview

How we work out your home’s outlook

At the top level it's four steps. You give us an address; we read the best public science at that exact point; we turn it into things a homeowner actually cares about — a climate type, heating and cooling costs, hazard signals and an overall grade — and hand you a plain report you can trace all the way back to its sources. Nothing is invented in between.

You give: an address, turned into a point on the map (lat / lon).
We read: public science at that point (today's climate normals, future projections, this week's weather, building & energy data).
We work out: your home's outlook (climate type & shift, heating & cooling cost, hazard signals).
You get: a plain report (a grade, a story, and the numbers behind it).

Every stage is traceable: each number in the final report still carries the dataset, baseline years and (for the future) the model spread it came from.

1 · Inputs

The datasets behind each number

Different questions are answered by different, purpose-built datasets. We never stretch one dataset to cover a question it wasn't built for. The full list, with links, lives on the Our data page; the ones that drive the science are:

Today's climate normals — CHELSA v2.1. A peer-reviewed global climatology at roughly 1 km resolution, baseline 1981–2010. Source of annual mean temperature, seasonality and annual precipitation at your point.
Climate type — Köppen-Geiger. The climate-class map (baseline 1991–2020) that names a location's climate and shows how that class is projected to shift.
The future — NASA NEX-GDDP-CMIP6. Statistically downscaled CMIP6 projections for future 30-year windows (2041–2070 and 2071–2100) under low, medium and high emissions pathways. We use multiple models from this collection (see §4).
Heating & cooling — building-stock surveys. US energy surveys (RECS / CBECS) and the EU's measured-calibrated TABULA typologies, combined with the climate above to estimate a home's heating and cooling demand and cost.
This week's weather — NOAA & ECMWF. Public numerical weather models, refreshed each model cycle, for the 7-day forecast and near-term hazards. This is live forecast data; everything else above is a fixed, preprocessed climate baseline.

2 · Baselines

Which “today” a number describes

“Today’s climate” is always a 30-year average, because that is what separates climate from weather. A subtlety worth stating plainly: not every dataset uses the exact same 30 years, so a report can carry more than one “present” baseline at once. We label each number with the window it actually describes rather than forcing them to look identical:

Number	Baseline window	Why
Climate normals (temp, rainfall)	1981–2010	CHELSA v2.1's published baseline.
Climate type (Köppen)	1991–2020	The Köppen-Geiger product's baseline.
Future outlook	2041–2070 / 2071–2100	CMIP6 30-year projection windows.
This week's forecast	live	Current numerical weather model cycle.

The roughly ten-year offset between the 1981–2010 and 1991–2020 baselines is worth about a tenth to a few tenths of a degree of warming. That is small but real, and we don't hide it by pretending the datasets are on the same clock.

3 · Validation

How good is the baseline, really?

A model is only as trustworthy as its track record. We compared TerraCube's present-day climatology against ERA5 reanalysis (via the free Open-Meteo archive) — an independent, observation-constrained dataset built by a different institution with a different method — at 30 globally distributed points, on the same 1981–2010 baseline so the comparison is fair. The full, re-runnable harness and per-point results are in the repository.

Each dot is one of 30 places worldwide, from Reykjavík to Bangkok. The closer a dot sits to the dashed line, the closer TerraCube is to the independent reference.

0.47 °C

Typical temperature error

On average our present-day temperature lands about half a degree from the independent reference.

0.998

Temperature correlation

Out of a perfect 1.0 — TerraCube ranks warm and cool places almost exactly right.

0.84

Rainfall correlation

Weaker, and we say so: rainfall is genuinely harder to model, especially in mountains and monsoons.

What those terms mean. Typical error (MAE, the mean absolute error) is how far off we are at an average place; smaller is better. Correlation is whether we get the pattern right (do warmer places come out warmer?): 1 is a perfect match and 0 means no relationship at all. The full table below adds bias (do we run consistently warm or cold?) and RMSE (root-mean-square error, which punishes big misses harder).

Variable	Bias	Typical error (MAE)	RMSE	Correlation
Annual mean temperature	+0.42 °C	0.47 °C	0.57 °C	0.998
Annual precipitation	+16 mm	203 mm	350 mm	0.84
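For readers who want to reproduce the four columns, here is a minimal sketch of how they can be computed from paired values. It is not the harness in the repository: the function name `validation_metrics` and the sample numbers are made up for illustration, and the correlation shown is the standard Pearson coefficient, which we assume is the one reported.

```python
import math

def validation_metrics(model, reference):
    """Return bias, MAE, RMSE and Pearson correlation for paired values."""
    if len(model) != len(reference) or len(model) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    n = len(model)
    errors = [m - r for m, r in zip(model, reference)]

    bias = sum(errors) / n                             # signed: warm/wet vs cold/dry
    mae = sum(abs(e) for e in errors) / n              # typical size of a miss
    rmse = math.sqrt(sum(e * e for e in errors) / n)   # weights big misses more

    mean_m = sum(model) / n
    mean_r = sum(reference) / n
    cov = sum((m - mean_m) * (r - mean_r) for m, r in zip(model, reference))
    var_m = sum((m - mean_m) ** 2 for m in model)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    if var_m == 0 or var_r == 0:
        raise ValueError("correlation is undefined for a constant series")
    correlation = cov / math.sqrt(var_m * var_r)

    return {"bias": bias, "mae": mae, "rmse": rmse, "correlation": correlation}

# Made-up annual mean temperatures (°C) at four points, not TerraCube results.
model = [5.0, 12.5, 20.0, 27.5]
reference = [4.5, 12.0, 20.5, 27.0]
metrics = validation_metrics(model, reference)
print(metrics)
```

Bias keeps the sign of each error, so misses in opposite directions cancel; MAE and RMSE do not, which is why a small bias can sit next to a much larger typical error, as it does for precipitation in the table.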

In plain terms: our temperature baseline sits within about half a degree of an independent reference and ranks places almost perfectly (correlation 0.998). Precipitation agrees far less tightly, especially in monsoon and mountain regions. That is expected: rainfall is harder to model, and the reference's rainfall is itself model-generated. We report both so you can weight them accordingly. A few degrees of disagreement in steep terrain is largely due to the reference's coarser resolution, not our error; the harness documents each source of expected disagreement.

See the full validation harness, sources of disagreement and per-point results →

4 · Uncertainty

Why the future is a range, not a line

No single climate model is “the” answer: each makes slightly different assumptions, so they disagree, and that disagreement is real information. Rather than quietly pick one model and present its output as certainty, TerraCube samples an ensemble of climate models from the NEX-GDDP-CMIP6 collection at your point and reports two things for every future value:

The central estimate — the median across the models, which is more robust than any one model alone.
The spread — the range the models cover, shown as an uncertainty band. A wide band means the science is genuinely less settled for that place and variable; a narrow band means the models largely agree.

This is the same band you see on the future charts in your report: one honest middle line, wrapped in how much the models disagree.
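As a rough illustration of the two numbers, the sketch below reduces a set of per-model values to a median and a range. The function name `summarize_ensemble` and the five values are hypothetical, and taking the spread as the full minimum-to-maximum range is an assumption; a percentile band would be an equally plausible reading of "the range the models cover".

```python
from statistics import median

def summarize_ensemble(values):
    """Central estimate (median) and spread (min to max) across models."""
    if not values:
        raise ValueError("ensemble is empty")
    return {
        "central": median(values),  # robust to a single outlying model
        "low": min(values),         # lower edge of the band (assumed: minimum)
        "high": max(values),        # upper edge of the band (assumed: maximum)
    }

# Hypothetical warming (°C) from five models, same pathway and 30-year window.
ensemble = [1.8, 2.1, 2.4, 2.6, 3.3]
summary = summarize_ensemble(ensemble)
print(summary)
```

Note how the one warm outlier (3.3) widens the band without moving the central estimate, which is the practical reason to prefer a median over a mean here.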

You also choose the emissions pathway (roughly: how much the world curbs greenhouse gases), because that choice, not model disagreement, is the single biggest lever on the later part of this century. The band describes model uncertainty within a pathway; switching pathways shows how much our collective choices still matter. Every projection in the report and API carries its central value and its spread — never a bare number pretending to a precision it doesn't have.

5 · Heating & cooling

How the energy estimate is built

A home's heating and cooling demand is driven by how far, and how long, outdoor temperature sits away from a comfortable indoor target. The standard measure is degree-days: add up, over a year, how many degrees each day's mean temperature sits below a heating base (heating degree-days) or above a cooling base (cooling degree-days).
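To make the definition concrete, here is a minimal sketch of the textbook calculation from daily mean temperatures. The function name and the base temperatures (18 °C for heating, 22 °C for cooling) are placeholders chosen for illustration, not the bases TerraCube uses.

```python
def degree_days(daily_mean_temps, heating_base=18.0, cooling_base=22.0):
    """Sum heating and cooling degree-days over a series of daily means (°C)."""
    # Each day contributes only the amount by which it misses the base.
    hdd = sum(max(heating_base - t, 0.0) for t in daily_mean_temps)
    cdd = sum(max(t - cooling_base, 0.0) for t in daily_mean_temps)
    return hdd, cdd

# Four illustrative days: cold, cool, mild, hot.
temps = [10.0, 16.0, 20.0, 25.0]
hdd, cdd = degree_days(temps)
print(hdd, cdd)  # 10.0 3.0
```

The mild day at 20 °C adds nothing to either total, because it falls between the two bases.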

We compute degree-days from local monthly-mean temperature (CHELSA-based for today; the downscaled projection for the future) using Thom's method, which accounts for the fact that daily temperatures swing above and below the monthly mean — a naïve monthly calculation badly undercounts near the base temperature. Two calibration choices matter and are made explicit rather than hidden:

Regional base (balance-point) temperatures. The temperature at which a typical home needs neither heating nor cooling isn't a single global constant — it varies by region and differs for heating versus cooling. We use region-appropriate bases rather than one worldwide default.
Day-to-day variability. Thom's method needs an estimate of how much daily temperatures spread around the monthly mean; we vary that with latitude and season rather than assuming one fixed value everywhere.
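The sketch below shows why the daily swing matters. It does not reproduce Thom's published formula; as a stand-in, it assumes daily mean temperatures within a month are normally distributed around the monthly mean, which gives a closed form for the expected degree-days. The function name `monthly_hdd`, the 18 °C base and the 3 °C spread are illustrative assumptions.

```python
import math

def _norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def _norm_cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def monthly_hdd(monthly_mean, sigma, days, base=18.0):
    """Expected heating degree-days for one month.

    Assumes daily means ~ Normal(monthly_mean, sigma); this is a simplified
    stand-in for Thom's method, not the method itself.
    """
    gap = base - monthly_mean
    if sigma <= 0:
        # No day-to-day variability: falls back to the naive monthly count.
        return days * max(gap, 0.0)
    z = gap / sigma
    # E[max(base - T, 0)] for a normal T, multiplied by the number of days.
    return days * (gap * _norm_cdf(z) + sigma * _norm_pdf(z))

# A 30-day month whose mean sits exactly on the base temperature.
naive = 30 * max(18.0 - 18.0, 0.0)
smoothed = monthly_hdd(18.0, 3.0, 30)
print(naive, round(smoothed, 1))
```

With the monthly mean exactly on the base, the naive count is zero even though roughly half the days are colder than the base; the variability-aware estimate recovers about 36 degree-days. Far from the base the two approaches converge.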

The resulting degree-days are combined with building-stock survey data and local energy prices to translate into an estimated heating/cooling cost. It is a well-grounded estimate for a typical home of its type and age — not a metered audit of your specific building.
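The last step can be pictured with the standard degree-day energy relation. This is a sketch under stated assumptions rather than our production calculation: the heat-loss coefficient stands in for what the building-stock surveys provide, and every parameter name and value below is hypothetical.

```python
def annual_heating_cost(hdd, heat_loss_w_per_k, efficiency, price_per_kwh):
    """Estimate yearly heating cost from heating degree-days.

    hdd               -- heating degree-days (°C·day)
    heat_loss_w_per_k -- whole-building heat loss (W/K), hypothetical input
    efficiency        -- heating system efficiency, between 0 and 1
    price_per_kwh     -- local energy price per kWh
    """
    if efficiency <= 0:
        raise ValueError("efficiency must be positive")
    # Degree-days times 24 h gives degree-hours; times W/K gives Wh of heat.
    heat_kwh = hdd * 24.0 * heat_loss_w_per_k / 1000.0
    # Delivered energy is higher than useful heat when efficiency is below 1.
    return heat_kwh / efficiency * price_per_kwh

cost = annual_heating_cost(
    hdd=2500.0, heat_loss_w_per_k=200.0, efficiency=0.9, price_per_kwh=0.15
)
print(round(cost, 2))  # 2000.0
```

Because the cost scales linearly with degree-days, a projected change in degree-days carries straight through to the estimated bill for the same building and price.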

6 · Limits

Hazard signals — and what they can’t tell you

Some risks are reported as proxies — indicators derived from climate variables rather than a full physical hazard model. Proxies are genuinely useful for “is this getting worse here, and roughly how much?” but they have limits we state up front:

Heat stress is derived from temperature (and, where available, humidity) statistics — a strong indicator of trend and relative severity, but not a substitute for a local public-health heat-health warning.
Wildfire and drought signals are built from climate indices (dryness, heat, seasonality). They capture whether conditions are trending more fire- or drought-prone; they do not model fuel, vegetation, ignition or local suppression, so they aren't a parcel-level fire rating.
Flood is not a proxy we fake. True flood risk needs hydrology and elevation modeling we don't yet run, so we don't invent a flood score — consistent with the honest-gaps promise.

Every hazard signal on the site is labelled with what drives it, so you can tell a modeled quantity from a proxy indicator at a glance.

7 · Traceability

Provenance travels with every number

Each value in a report states, on the page, which dataset, baseline window and — for projections — which scenario and model ensemble produced it, along with the model spread. The same attribution is attached in the API response, so a number never travels without its source. If you can't trace a number, you shouldn't have to trust it.
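One way to picture a number that carries its source is a small record type like the one below. This is an illustration only: the class name `TracedValue`, its fields and the example value are hypothetical and do not describe the actual API response.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class TracedValue:
    """A reported number bundled with where it came from (hypothetical shape)."""
    value: Optional[float]          # None means "no data", never a guess
    unit: str
    dataset: str
    baseline: str
    scenario: Optional[str] = None  # only set for projections
    spread: Optional[Tuple[float, float]] = None  # (low, high) across models

    def describe(self) -> str:
        if self.value is None:
            return f"no data ({self.dataset}, {self.baseline})"
        text = f"{self.value} {self.unit} ({self.dataset}, {self.baseline}"
        if self.scenario is not None:
            text += f", {self.scenario}"
        text += ")"
        if self.spread is not None:
            low, high = self.spread
            text += f", model spread {low} to {high} {self.unit}"
        return text

# Made-up example value, labelled with the dataset and window it describes.
normal = TracedValue(11.3, "°C", "CHELSA v2.1", "1981–2010")
print(normal.describe())  # 11.3 °C (CHELSA v2.1, 1981–2010)
```

Making the record immutable (`frozen=True`) reflects the idea that a value and its attribution are created together and cannot drift apart later.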
