8. xarray Plotting (I): Plotting 1-Dimensional Data

8. xarray Plotting (I): Plotting 1-Dimensional Data#

Visualizing data with clear and aesthetic figures is crucial for demonstrating results. xarray provides convenient methods to plot DataArray and Dataset objects directly. Below are various plotting options for different types of data:

Single Variable Data (xarray.DataArray.plot):

Multi-Variable Data (xarray.Dataset.plot):

All xarray plotting methods support additional Matplotlib keyword arguments (**kwargs) for further customization.

Time Series Plot#

Plot using xarray.DataArray.plot.line. The xarray website provides clear instructions, but we will cover some advanced usage here.

Fill-Between#

Example 1: Calculate the pentad rainfall time series over the box area of Taiwan-North South China Sea (18˚-24˚N, 116˚-126˚E). Use a gray shaded area to represent the climatological maxima and minima, and use a thick black line to represent the climatological median.

We first read data, convert daily data to pentad, and calculate the maxima, minima, and median over 1998-2020.

import xarray as xr 
from matplotlib import pyplot as plt
import pandas as pd 
import matplotlib as mpl
from matplotlib import ticker

mpl.rcParams['figure.dpi'] = 100

lats, latn = 18, 24                   
lon1, lon2 = 116,126         
ys = 1998    # Start year
ye = 2020    # End year

plt.rcParams.update({'font.size': 14})

def daily_to_pentad(data): 
    # To convert daily data to pentad. See Unit 5 for detail. 
    ptd = xr.DataArray(dims=['year','pentad'], 
                       coords=dict(year=range(ys,ye+1,1), 
                                   pentad=range(1,74,1),
                       name='precip')
    )
    for yy in ptd.year: 
        for p in ptd.pentad: 
            ptd.loc[yy,p] = data[ int((yy-1998)*365+ (p-1)*5) : int((yy-1998)*365+ (p-1)*5 + 4)  ].sum()
    return(ptd)
    
pcp_ds = xr.open_dataset('data/cmorph_sample.nc')
pcp = pcp_ds.sel(time=slice('1998-01-01','2020-12-31'),lat=slice(lats,latn),lon=slice(lon1,lon2)).cmorph 
    
pcp_ptd = daily_to_pentad(pcp.mean(axis=(1,2))) 

pcp_max = pcp_ptd.max(axis=0)
pcp_min = pcp_ptd.min(axis=0)
pcp_med = pcp_ptd.median(axis=0)

Using plot.line(), we can plot three lines: pcp_max, pcp_med, and pcp_min. To add a shaded area between the maxima and minima, we use matplotlib.axes.Axes.fill_between:

Axes.fill_between(x, y1, y2, **kwargs)

Here, we specify the values of x, and the shaded area between y1 and y2 (or pcp_max and pcp_min).

fig, ax = plt.subplots(1,1,figsize=(10,5))

maxl = pcp_max.plot.line(color='black',linestyle='--',
                         x='pentad', xlim=(1,73), ylim=(0,200),
                         ax=ax)
minl = pcp_min.plot.line(color='black',linestyle='--',
                         x='pentad', xlim=(1,73), ylim=(0,200),
                         ax=ax)
medl = pcp_med.plot.line(color='black',linestyle='-', linewidth=3,
                         x='pentad', xlim=(1,73), ylim=(0,200),
                         ax=ax)

ax.fill_between(pcp_ptd.pentad, pcp_max, pcp_min, color='silver')

ax.set_xticks(range(1,75,5))
ax.set_xticks(range(1,73),minor=True)
ax.set_xlabel('Pentad') 
ax.set_ylabel('Pentad accumulative rainfall (mm)')
ax.set_title(" ")

plt.subplots_adjust(left=0.1,bottom=0.15,top=0.8,right=0.9)

plt.show()
_images/b86d959e9050471085d20059245c8473555a23bb3cadfa0f04fddba06848f209.png

Exercise

Add a blue thick curve to represent the data in any year of your interest.

Twin Axes#

Sometimes we need to plot two variables on a single plot. However, if the magnitudes of these two variables are very different, we use twin axes to set two different y (or x) axes. Below is an example from Hung and Yanai (2004) on the Australian summer monsoon.

Example 2: Plot the Time Series of Australian Summer Monsoon Indices. The indices from Hung and Yanai (2004, QJRMS) use 850-hPa zonal wind and OLR to estimate the monsoon onset. Since the magnitudes of wind and OLR are very different, we can plot them on the same figure using twin axes.

Step 1: Read the 850-hPa zonal wind and OLR data and select the spatial and temporal ranges.

import xarray as xr
import pandas as pd
from matplotlib import pyplot as plt

plt.rcParams.update({'font.size': 16})

# Set latitude, longitude parameters. 
lats, latn = -15, -2
lon1, lon2 = 110, 150
plev = 850

# Read data
uds = xr.open_mfdataset( ['data/ncep_r2_uv850/u850.2019.nc', 'data/ncep_r2_uv850/u850.2020.nc'], 
                           combine = "nested",               
                           concat_dim='time',          
                           parallel=True              
                         )
u = uds.sel(time=slice('2019-11-01','2020-03-31'), 
            level=850,
            lat=slice(latn,lats),
            lon=slice(lon1,lon2)).uwnd

olr = (xr.open_dataset("data/olr.nc")
         .sel(time=slice('2019-11-01','2020-03-31'),
                    lat=slice(lats,latn),
                    lon=slice(lon1,lon2)).olr)

Step 2: Analysis and computation: average wind and OLR indices over 2˚-15˚S, 110˚-130˚E according to Hung and Yanai (2004) paper.

# Spatial averages.
u_yh04   =   u.sel(lat=slice(-2,-15), lon=slice(115,150)).mean(axis=(1,2))
olr_yh04 = olr.sel(lat=slice(-15,-2), lon=slice(115,150)).mean(axis=(1,2))

Step 3: Plot the two indices. Because the magnitudes of OLR and wind are different, we need to create two plot axes, where ax1 is used to plot 850-hPa zonal wind and ax2 is used for OLR. Both ax1 and ax2 will share a common x-axis. We will display the wind field based on the left y-axis and OLR on the right y-axis. The common x-axis can be set using the matplotlib.axes.Axes.twinx() method.

fig, ax1 = plt.subplots(figsize=(16,7))    
# Duplicate the axes with a different y axis and the same x axis
ax2 = ax1.twinx()  

# Plot u850 on ax1
u_yh04_plot = u_yh04.plot.line(u_yh04, color='blue',x='time', ylim=(-10,16) ,ax=ax1)
# Plot OLR on ax2
olr_plot = olr_yh04.plot.line(olr_yh04,color='red',x='time', ylim=(160,270),ax=ax2)

# Set the axis name and color. 
ax1.set_ylabel("850-hPa U", color='blue')
ax2.set_ylabel("OLR"      , color='red'  )
ax1.set_xlabel("Time") # ax2 has no property control over x axis

# Set y ticks
ax1.set_yticks(range(-10,18,2))
ax2.set_yticks(range(160,280,10))
ax1.tick_params(axis='y', colors='blue')
ax2.tick_params(axis='y', colors='red'  )
ax2.yaxis.label.set_color('red')

# set x ticks
xticks = pd.to_datetime(['2019-11-01','2019-11-11','2019-11-21','2019-12-01','2019-12-11','2019-12-21',
                         '2020-01-01','2020-01-11','2020-01-21','2020-02-02','2020-02-11','2020-02-21',
                         '2020-03-01','2020-03-11','2020-03-21','2020-03-31'])
ax1.set_xticks(xticks)
ax1.set_xlim(xticks[0],xticks[-1])
ax1.set_title('Australian Summer Monsoon Indices')

# u_yh04 = 2 m/s is the threshold of monsoon onset. 
ax1.axhline(y=2,linewidth=1,color='gray')

plt.subplots_adjust(bottom=0.25)

plt.show()
_images/4935797f1074e04ce4e7b5c6ce94e88415cbb9656f5b9bac812b221631e3d183.png