ClimaX via Radar

Generation | Completed | Aug 2025 – Dec 2025 | Project Participant

Highlights

Results

  • Test accuracy: 62.1% (multi-variable input: HSP + HSR + dBZ)
  • Single-variable accuracy: 56.7% (HSP-only forecasting)
  • Forecast horizon: 72 hours (regional rainfall prediction)

Details

Regional radar-based rainfall forecasting using the ClimaX transformer with multi-variable radar inputs.

Problem

Accurate rainfall forecasting from radar data requires modeling highly non-linear spatiotemporal dynamics across multiple atmospheric variables.
Conventional numerical and heuristic methods struggle to jointly capture local convective structures and large-scale weather patterns, especially for longer forecast horizons.

This project investigates whether ClimaX, a transformer-based climate model, can be adapted for regional radar-based rainfall prediction.

Approach

Data

We used Korean radar composite data with the following characteristics:

  • Temporal resolution: 10-minute intervals
  • Spatial resolution: 100 m
  • Original grid size: 2305 × 2881
  • Projection: Lambert Conformal Conic (LCC)
  • Target region: Seoul metropolitan area

Radar variables:

  • HSP: Radar-estimated rainfall intensity (mm/hr)
  • HSR: Rainfall derived from radar reflectivity
  • HSR_dBZ: Raw radar reflectivity (dBZ)

Data preprocessing steps:

  • Values ≤ −20000 treated as missing
  • Rainfall values scaled by dividing by 100
  • Reflectivity converted using the Marshall–Palmer Z–R relationship
  • Spatial cropping and downsampling to 100 × 100 grids
  • Dataset-wide normalization
  • Construction of a climatology map for ClimaX conditioning
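
The first three preprocessing steps can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names are hypothetical, the reflectivity is assumed to arrive already in dBZ, and the Z–R conversion is assumed to run from reflectivity to rain rate using the standard Marshall–Palmer coefficients (Z = 200 R^1.6).

```python
import numpy as np

MISSING_THRESHOLD = -20000  # values at or below this are treated as missing
RAIN_SCALE = 100.0          # stored rainfall values are divided by 100


def decode_rainfall(raw):
    """Mask missing pixels (as NaN) and rescale stored rainfall values."""
    raw = np.asarray(raw, dtype=np.float64)
    return np.where(raw <= MISSING_THRESHOLD, np.nan, raw / RAIN_SCALE)


def dbz_to_rain_rate(dbz, a=200.0, b=1.6):
    """Invert Z = a * R**b to get a rain rate R in mm/hr.

    The defaults a=200, b=1.6 are the Marshall-Palmer coefficients;
    whether the project used exactly these values is an assumption.
    """
    dbz = np.asarray(dbz, dtype=np.float64)
    z_linear = 10.0 ** (dbz / 10.0)  # dBZ -> linear reflectivity
    return (z_linear / a) ** (1.0 / b)
```

Missing pixels are kept as NaN rather than zero so that later statistics can ignore them instead of treating them as dry pixels.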

Days with missing radar channels were excluded.
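
The spatial and statistical steps, together with the day filter, might look like the sketch below. All helper names are hypothetical. Block averaging is only one possible way to downsample, and the crop window is an illustrative parameter rather than the project's actual Seoul window.

```python
import numpy as np


def crop_and_downsample(field, row0, col0, size, out=100):
    """Crop a size x size window and block-average it down to out x out."""
    if size % out:
        raise ValueError("window size must be a multiple of the output size")
    k = size // out
    window = field[row0:row0 + size, col0:col0 + size]
    blocks = window.reshape(out, k, out, k)
    return np.nanmean(blocks, axis=(1, 3))  # NaN (missing) pixels are ignored


def normalize(stack):
    """Dataset-wide normalization: one mean and std over all frames."""
    mean = np.nanmean(stack)
    std = np.nanstd(stack)
    return (stack - mean) / std, mean, std


def climatology(stack):
    """Per-pixel mean over the time axis (axis 0)."""
    return np.nanmean(stack, axis=0)


def complete_days(days_by_variable):
    """Keep only the days that are present for every radar variable."""
    day_sets = [set(days) for days in days_by_variable.values()]
    return sorted(set.intersection(*day_sets))
```

For example, `complete_days({"HSP": [...], "HSR": [...], "HSR_dBZ": [...]})` returns the days shared by all three channels.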

Model

We adopted RegionalClimaX, a grid-to-grid transformer model originally designed for climate variables.

Model configuration:

  • Patch size: 4 × 4
  • Embedding dimension: 1024
  • Transformer layers: 8
  • Attention heads: 16
  • Positional embeddings enabled
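
As a quick sanity check on this configuration, the token counts implied by a 100 × 100 input grid and 4 × 4 patches can be worked out directly. The helper name is hypothetical; the numbers come from the settings listed above.

```python
def num_patches(height, width, patch):
    """Number of non-overlapping patch x patch tiles covering the grid."""
    if height % patch or width % patch:
        raise ValueError("grid size must be divisible by the patch size")
    return (height // patch) * (width // patch)


tokens_per_variable = num_patches(100, 100, 4)       # 25 * 25 = 625
tokens_before_aggregation = 3 * tokens_per_variable  # 3 variables -> 1875
head_dim = 1024 // 16                                # 64 dims per head
```

Each token therefore summarizes a 4 × 4 block of the downsampled grid.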

Each radar variable is tokenized independently and aggregated via cross-variable attention, allowing the model to learn dependencies between reflectivity and rainfall intensity.
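
The aggregation step can be illustrated with a simplified, single-head NumPy sketch in which a learnable query attends over the per-variable embeddings at each patch position. This is an assumption about the general mechanism, not the RegionalClimaX implementation; the weights are random placeholders and the embedding size is reduced for the demo.

```python
import numpy as np


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def aggregate_variables(tokens, query, w_k, w_v):
    """Fuse per-variable patch tokens into one token per patch.

    tokens: (V, L, D) embeddings for V variables and L patches.
    query:  (D,) query vector shared across patches (learned in practice).
    w_k, w_v: (D, D) key and value projections.
    Returns an (L, D) array.
    """
    keys = tokens @ w_k                               # (V, L, D)
    values = tokens @ w_v                             # (V, L, D)
    scores = keys @ query / np.sqrt(query.shape[0])   # (V, L)
    weights = softmax(scores, axis=0)                 # attention over variables
    return (weights[..., None] * values).sum(axis=0)  # (L, D)


rng = np.random.default_rng(0)
V, L, D = 3, 625, 8  # 3 radar variables, 625 patches, small demo width
tokens = rng.normal(size=(V, L, D))
fused = aggregate_variables(
    tokens, rng.normal(size=D), rng.normal(size=(D, D)), rng.normal(size=(D, D))
)
```

After this step the sequence length no longer depends on the number of input variables, which is what lets the same backbone handle single-variable and multi-variable runs.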

Training

  • Total samples: 188,071
  • Time span: June 2021 – December 2023
  • Train / validation / test split: 70% / 20% / 10%
  • Batch size: 32
  • Epochs: 10
  • Forecast horizon: up to 72 hours
  • Training from scratch on a single GPU
  • Runtime: approximately 40 minutes per epoch

Multiple configurations were explored, including single-variable and multi-variable inputs.
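
The figures above imply the following bookkeeping. This sketch only computes sizes: how samples were assigned to each split (for example chronologically or at random) is not stated here, and the rounding rule is an assumption.

```python
import math


def split_sizes(total, train_frac=0.7, val_frac=0.2):
    """Return (train, val, test) counts; the test split takes the remainder."""
    n_train = int(total * train_frac)
    n_val = int(total * val_frac)
    return n_train, n_val, total - n_train - n_val


n_train, n_val, n_test = split_sizes(188_071)
batches_per_epoch = math.ceil(n_train / 32)  # batch size 32
horizon_steps = 72 * 60 // 10                # 72 h at 10-minute intervals = 432
total_minutes = 10 * 40                      # 10 epochs at roughly 40 min each
```

At roughly 40 minutes per epoch, a full 10-epoch run takes a little under 7 hours on the single GPU.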

Figure: ClimaX processing pipeline.

Evaluation

Performance was evaluated using:

  • Test accuracy
  • Mean squared error
  • Qualitative inspection of spatial rainfall patterns
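
A minimal sketch of the mean squared error follows. It assumes that missing target pixels are stored as NaN and excluded from the average; the project may have handled them differently. The exact definition of test accuracy is not given here, so it is not sketched.

```python
import numpy as np


def masked_mse(pred, target):
    """Mean squared error over pixels where the target is not missing."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    valid = ~np.isnan(target)
    if not valid.any():
        raise ValueError("target contains no valid pixels")
    diff = pred[valid] - target[valid]
    return float(np.mean(diff ** 2))
```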

Key observations:

  • Multi-variable inputs (HSP + HSR + dBZ) improved test accuracy over the HSP-only baseline (62.1% vs. 56.7%)
  • Large-scale rainfall structures were captured reliably
  • Fine-grained convective details were often smoothed
  • Patch-wise artifacts appeared in longer-horizon forecasts

Notes and Lessons Learned

  • ClimaX effectively models structured, multi-variable geophysical data
  • Multi-variable conditioning measurably improves radar-based forecasting over single-variable input
  • Patch tokenization limits fine-scale detail in long-horizon predictions
  • Future improvements may include:
    • Multi-scale patching
    • Temporal attention refinement
    • Hybrid convolution–transformer front-ends

This project served as a foundation for further exploration of transformer-based weather and climate forecasting using radar observations.

Stack

PyTorch, ClimaX, Radar