Atmospheric River Events
In this notebook, we explore Atmospheric River events in REMO model data. We show how to integrate a 3D REMO field along the vertical axis using the pressure coordinate: we sum the specific humidity \(Q_d\) along the vertical axis, weighted by the pressure difference, to reveal atmospheric river events in the model data.
The integral looks like this: $ Q_{vi} = \int_{bottom}^{top} Q_d \, dp$
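On the model's discrete hybrid levels, this integral becomes a weighted sum: each mid-level humidity value is weighted by the pressure difference between the adjacent level interfaces, which is the discretization used in the code of this notebook:

```latex
Q_{vi} \;\approx\; \sum_{k=1}^{N_{lev}} Q_{d,k}\,\Delta p_k,
\qquad
\Delta p_k = p_{k+1/2} - p_{k-1/2}
```

Here \(p_{k\pm 1/2}\) are the pressures at the level interfaces above and below mid level \(k\).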
Let’s fetch some REMO example output.
Workflow
[1]:
import pyremo as pr
tfile = pr.tutorial.load_dataset("remo_output_3d")
We define the formula for the computation of pressure levels here using the vertical hybrid sigma coefficients.
[2]:
def pressure(ak, bk, ps):
    """Returns pressure levels.

    Computes pressure levels from the hybrid sigma coefficients
    using the surface pressure ps.
    """
    return ak + bk * ps
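As a quick sanity check (a toy sketch with made-up numbers, not REMO data): at the surface interface the coefficients are ak = 0 and bk = 1, so the formula returns the surface pressure itself, while at the model top bk = 0 and the pressure equals ak:

```python
def pressure(ak, bk, ps):
    """Returns pressure levels from hybrid sigma coefficients."""
    return ak + bk * ps


ps = 101325.0                      # surface pressure in Pa (made-up value)
print(pressure(0.0, 1.0, ps))      # surface interface: equals ps
print(pressure(1000.0, 0.0, ps))   # model top: equals ak
```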
The hybrid sigma coefficients can be defined at level interfaces or at mid levels, depending on which locations we need. For this integration, we'll use the pressure differences between level interfaces, while the specific humidity is defined on the mid levels. Consequently, we need hyai and hybi to compute the pressure field defined on the level interfaces.
[3]:
aki = tfile.hyai  # i: defined at level interfaces
bki = tfile.hybi
akm = tfile.hyam  # m: defined at mid levels
bkm = tfile.hybm
[4]:
pi = pressure(aki, bki, tfile.PS) # pressure at level interfaces
pm = pressure(akm, bkm, tfile.PS) # pressure at mid levels
Let’s check that the bottom level of pi is equal to the surface pressure.
[5]:
(pi.isel(nhyi=-1) == tfile.PS).all()  # bottom interface should equal the surface pressure
[5]:
<xarray.DataArray ()> array(True)
That should be True. Now, we’ll compute the pressure differences along the vertical axis.
[6]:
dpi = pi.diff(dim="nhyi")
# some xarray bookkeeping: give the pressure difference the same level
# coordinate as the QD variable that we want to integrate.
dpi = dpi.rename({"nhyi": "lev"})
dpi["lev"] = tfile.lev
Now, let’s do the math. We’ll compute the integrand.
[7]:
dQVI = tfile.QD * dpi
Now we can easily integrate along the vertical axis.
[8]:
QVI = dQVI.sum(dim="lev")
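The diff-and-sum pattern above can be sketched with plain NumPy (toy numbers, not REMO output): from N+1 interface pressures we get N layer thicknesses Δp that line up with the N mid-level humidity values, and by construction the Δp values telescope to ps − p_top:

```python
import numpy as np

# toy column: 5 levels -> 6 interface pressures (Pa), top to bottom
p_interface = np.array([1000.0, 20000.0, 50000.0, 80000.0, 95000.0, 101325.0])
q_mid = np.array([1e-6, 1e-4, 1e-3, 5e-3, 8e-3])  # specific humidity (kg/kg)

dp = np.diff(p_interface)   # N layer thicknesses from N+1 interfaces
qvi = np.sum(q_mid * dp)    # the vertical integral as a weighted sum

# the layer thicknesses telescope to surface pressure minus top pressure
assert np.isclose(dp.sum(), p_interface[-1] - p_interface[0])
print(qvi)
```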
Add consistent metadata
[9]:
QVI.name = "QVI"
QVI.attrs = tfile.QD.attrs
ds = QVI.assign_attrs(
    units="kg/m/s",
    standard_name="integrated_specific_humidity",
    long_name="Integrated specific humidity",
).to_dataset()
ds["rotated_pole"] = tfile.rotated_pole
ds
[9]:
<xarray.Dataset>
Dimensions:       (rlat: 121, rlon: 129, time: 1)
Coordinates:
  * time          (time) datetime64[ns] 2006-01-01
  * rlon          (rlon) float64 -31.73 -31.29 -30.85 ... 23.71 24.15 24.59
  * rlat          (rlat) float64 -26.73 -26.29 -25.85 ... 25.19 25.63 26.07
Data variables:
    QVI           (time, rlat, rlon) float64 29.37 29.85 29.65 ... 6.588 6.583
    rotated_pole  int32 1
Plot results of one timestep
[10]:
pole = (
    ds.rotated_pole.grid_north_pole_longitude,
    ds.rotated_pole.grid_north_pole_latitude,
)
pole
[10]:
(-162.0, 39.25)
[11]:
def plot_contourf(da, pole, vmin=None, vmax=None, levels=20):
    %matplotlib inline
    import cartopy.crs as ccrs
    import cartopy.feature as cf

    # use NCAR colormaps from https://github.com/hhuangwx/cmaps
    # conda install -c conda-forge cmaps
    import cmaps
    import matplotlib.pyplot as plt

    plt.figure(figsize=(15, 10))
    ax = plt.axes(projection=ccrs.RotatedPole(*pole))
    ax.gridlines(
        draw_labels=True,
        linewidth=0.5,
        color="gray",
        xlocs=range(-180, 180, 10),
        ylocs=range(-90, 90, 5),
    )
    da.plot.contourf(
        ax=ax,
        cmap=cmaps.WhiteBlueGreenYellowRed,
        levels=levels,
        transform=ccrs.RotatedPole(*pole),
        vmin=vmin,
        vmax=vmax,
    )
    ax.coastlines(resolution="50m", color="black", linewidth=1)
[12]:
plot_contourf(ds.QVI.isel(time=0), pole)
Summary: Structure our prototype code into functions
[13]:
def pressure(ak, bk, ps):
    """Returns pressure levels.

    Computes pressure levels from the hybrid sigma coefficients
    using the surface pressure ps.
    """
    return ak + bk * ps


def integrate_vertical_by_pressure(da, p):
    """Integrates a variable along the vertical axis.

    Integrates the variable in the vertical along the
    pressure coordinate.
    """
    dp = p.diff(dim="nhyi")
    dp = dp.rename({"nhyi": "lev"})
    dp["lev"] = da.lev
    dda = da * dp
    return dda.sum(dim="lev")


def compute_qvi(ds):
    """Computes the vertically integrated specific humidity (QVI)."""
    pi = pressure(ds.hyai, ds.hybi, ds.PS)  # pressure at level interfaces
    qvi = integrate_vertical_by_pressure(ds.QD, pi)
    # make metadata consistent...
    qvi.name = "QVI"
    qvi.attrs = ds.QD.attrs
    result = qvi.assign_attrs(
        units="kg/m/s",
        standard_name="integrated_specific_humidity",
        long_name="Integrated specific humidity",
    ).to_dataset()
    result["rotated_pole"] = ds.rotated_pole
    return result
[14]:
qvi = compute_qvi(tfile)
plot_contourf(qvi.QVI.isel(time=0), pole)
Use Timeseries from REMO output
NOTE: Only possible with access to the Mistral HPC system at DKRZ!
We will use 3D data from one month of REMO output here.
[16]:
import os
import tarfile
tar = tarfile.open("/work/ch0636/g300046/remo_results_056524/2005/e056524t200509.tar")
scratch = os.path.join(os.getenv("SCRATCH"), "tar-tmp")
# tar.extractall(scratch)  # uncomment if not already extracted...
[ ]:
# optional, use dask client for parallel processing
from dask.distributed import Client
client = Client()
client
Here, we need some code to convert classical REMO IEG output into NetCDF format. We use a simple dask delayed approach to convert the files in parallel. Note that, in the future, REMO will produce native NetCDF output which we can process without any further postprocessing.
[17]:
import subprocess

# from cdo import Cdo
# python cdo bindings don't work with dask because of
# incorrect signal handling...
from glob import glob

filenames = glob(os.path.join(scratch, "*"))

# cdo = Cdo(tempdir=os.path.join(scratch, 'nc'))


def cdo_call(options="", op="", input="", output=None):
    call = f"cdo {options} {op} {input} {output}"
    subprocess.Popen(call, shell=True, stdout=subprocess.PIPE).stdout.read()
    return output
    # return subprocess.run(f"cdo {options} {op} {input} {output}", shell=True)
def convert_with_cdo(f):
    """Convert a single file into NetCDF format."""
    file = os.path.basename(f)
    path = os.path.dirname(f)
    # cdo = Cdo()
    # return cdo.copy(options='-f nc', input=f, output=os.path.join(path, 'nc', file + '.nc'))
    return cdo_call(
        options="-f nc",
        op="copy",
        input=f,
        output=os.path.join(path, "nc", file + ".nc"),
    )
def convert_files(files, with_dask=False):
    """Convert many files into NetCDF format."""
    results = []
    if with_dask:
        from dask import delayed
    else:

        def delayed(x):
            return x

    for f in files:
        if os.path.isfile(f):
            nc = delayed(convert_with_cdo)(f)
            results.append(nc)
    return results
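The with_dask switch in convert_files relies on a small trick: when dask is not wanted (or not installed), delayed is replaced by an identity wrapper, so the same call site either builds a lazy task graph or runs eagerly. A minimal standalone sketch of that pattern (maybe_delayed and the toy convert function are hypothetical stand-ins, not pyremo API):

```python
def maybe_delayed(func, with_dask=False):
    """Wrap func with dask.delayed if requested, else return it unchanged."""
    if with_dask:
        from dask import delayed

        return delayed(func)
    return func  # identity fallback: calls run eagerly


def convert(f):
    # toy stand-in for convert_with_cdo
    return f + ".nc"


results = [maybe_delayed(convert)(f) for f in ["a", "b"]]
print(results)  # eager path: ['a.nc', 'b.nc']
```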
[18]:
import dask
from dask.distributed import progress
output = convert_files(filenames, with_dask=True)
nc_files = dask.persist(output)
progress(nc_files)
[19]:
nc_files_ = dask.compute(nc_files)[0][0]
[23]:
nc_files_ = glob(os.path.join(scratch, "nc", "*.nc"))
nc_files_.sort()
Access NetCDF files
We use an optimized function here for opening large datasets that consist of many files:
[24]:
def open_mfdataset(
    files,
    use_cftime=True,
    parallel=True,
    data_vars="minimal",
    chunks={"time": 1},
    coords="minimal",
    compat="override",
    drop=None,
    **kwargs,
):
    """Optimized function for opening large cf datasets.

    based on https://github.com/pydata/xarray/issues/1385#issuecomment-561920115
    """
    import xarray as xr

    def drop_all_coords(ds):
        # ds = ds.drop(drop)
        return ds.reset_coords(drop=True)

    ds = xr.open_mfdataset(
        files,
        parallel=parallel,
        decode_times=False,
        combine="by_coords",
        preprocess=drop_all_coords,
        decode_cf=False,
        chunks=chunks,
        data_vars=data_vars,
        coords=coords,
        compat=compat,
        **kwargs,
    )
    return xr.decode_cf(ds, use_cftime=use_cftime)
[25]:
nc_files_.sort()
%time remo_data = open_mfdataset(nc_files_)
CPU times: user 13.8 s, sys: 556 ms, total: 14.3 s
Wall time: 19.7 s
The converted NetCDF files are missing some metadata and a consistent calendar, which we fix here.
[26]:
from pyremo import remo_ds as rds
remo_data = rds.update_meta_info(remo_data)
[27]:
remo_data = rds.parse_dates(remo_data, use_cftime=True)
Process the timeseries
Now, we can process the xarray dataset timeseries. Since we only use native xarray functions, they will automatically be vectorized along the time axis. To integrate the specific humidity for a certain time range, we can just call:
[28]:
subset = remo_data.sel(time=slice("2005-09-12T00:00:00", "2005-09-14T23:00:00"))
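The label-based time slicing used here can be sketched with a small pandas example (toy hourly series; in the notebook, xarray's .sel does the same lookup on the time coordinate):

```python
import pandas as pd

# toy hourly time axis covering the event period
times = pd.date_range("2005-09-12", "2005-09-15", freq="h")
s = pd.Series(range(len(times)), index=times)

# label-based slice, endpoints inclusive (like xarray's .sel with a slice)
subset = s.loc["2005-09-12T00:00:00":"2005-09-14T23:00:00"]
print(subset.index[0], subset.index[-1])
```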
[29]:
qvi = compute_qvi(subset)
[30]:
qvi_ = qvi.persist()
progress(qvi_)
[31]:
qvi_ = qvi_.compute()
[32]:
plot_contourf(
    qvi_.QVI.sel(time="2005-09-14T17:00:00").squeeze(),
    pole,
    vmin=0,
    vmax=500,
    levels=11,
)
Show Atmospheric River Event, Sept. 2005, Norway
The remnants of Hurricanes Maria and Nate (storm Kristin) hit the Norwegian coast on 14 September 2005.
[57]:
import cartopy.crs as ccrs
import cmaps

p = qvi_.QVI.sel(
    time=slice("2005-09-12T00:00:00", "2005-09-14T23:00:00", 6)
).plot.contourf(
    col="time",
    col_wrap=4,
    cmap=cmaps.WhiteBlueGreenYellowRed,
    levels=11,
    vmin=0,
    vmax=500,
    figsize=(15, 10),
    subplot_kws={"projection": ccrs.RotatedPole(*pole)},
)
for ax in p.axes.flat:
    ax.coastlines()
    ax.gridlines()