# 4. Conditional Control of `datetime` Index

## Conditional Control of `datetime` Index in `xarray`

In Unit 3, we demonstrated how to use the `sel` method with `slice` to select data over a continuous temporal or spatial range. Sometimes, however, we need non-contiguous time periods, such as certain months across several years, and `slice` cannot express that. Instead, we can select with a condition. The time coordinate in xarray holds datetime values, which expose attributes such as year, month, and day through the `.dt` accessor. We can use these attributes to pick the dates we want.

**Example 1: Select only the JAS season data.** 

In [1]:
import xarray as xr 

olr_da = xr.open_dataset("data/olr.nc").olr
olr_jas = olr_da.sel(time=(olr_da.time.dt.month.isin([7,8,9]))) 
olr_jas

For `time=(olr_da.time.dt.month.isin([7, 8, 9]))`, xarray will check if the month of each timestep falls in either the 7th, 8th, or 9th month (i.e., July, August, or September). If so, the timestep will be marked `True`. Otherwise, it will be marked `False`. Finally, only the data points marked `True` will be preserved.
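To make the masking step visible, here is a minimal sketch on synthetic data. The array `demo_da` and its values are made up for illustration and merely stand in for the OLR data, which is assumed to have a daily `time` coordinate.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for the OLR data: two years of daily values (hypothetical).
time = pd.date_range("2001-01-01", "2002-12-31", freq="D")
demo_da = xr.DataArray(
    np.arange(time.size, dtype=float), coords={"time": time}, dims="time"
)

# Boolean mask: True where the month is July, August, or September.
jas_mask = demo_da.time.dt.month.isin([7, 8, 9])

# Keep only the time steps marked True.
demo_jas = demo_da.sel(time=jas_mask)

print(int(jas_mask.sum()))                        # number of True entries
print(np.unique(demo_jas.time.dt.month.values))   # months that remain
```

The mask is an ordinary boolean DataArray, so it can be inspected, stored in a variable, or combined with other conditions before it is passed to `sel`.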

**Example 2: Remove Leap Days**

Similar to Example 1, we can use *reverse selection* to remove the leap days. This means selecting all dates that are not February 29th.


In [2]:
# Mask that is True on February 29; ~ inverts it so every other day is kept
is_leap_day = (olr_da.time.dt.month == 2) & (olr_da.time.dt.day == 29)
olr_noleap = olr_da.sel(time=~is_leap_day)
olr_noleap
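A quick way to check the reverse selection is to apply it to a synthetic daily series that covers a leap year and compare the lengths. The names `demo_da` and `demo_noleap` below are illustrative only and do not come from the OLR file.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic daily series covering the leap year 2000 (hypothetical data).
time = pd.date_range("2000-01-01", "2000-12-31", freq="D")
demo_da = xr.DataArray(np.ones(time.size), coords={"time": time}, dims="time")

# True only on February 29.
is_leap_day = (demo_da.time.dt.month == 2) & (demo_da.time.dt.day == 29)

# ~ flips the mask, so everything except the leap day is kept.
demo_noleap = demo_da.sel(time=~is_leap_day)

print(demo_da.sizes["time"], demo_noleap.sizes["time"])
```

The parentheses around each comparison matter: `&` binds more tightly than `==` in Python, so leaving them out changes the meaning of the expression.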

## DatetimeIndex and Its Applications

Using `pandas`, we can easily create datetime objects. The `to_datetime` function converts date-formatted strings into datetime objects; given a list of strings, it returns a `DatetimeIndex`.

In [3]:
import pandas as pd

pd.to_datetime(["2000-01-01", "2000-02-02"])

DatetimeIndex(['2000-01-01', '2000-02-02'], dtype='datetime64[ns]', freq=None)

We can also specify the start time and the length to create the time series. 

In [4]:
ts = pd.date_range("2000-01-01", periods=365)
ts

DatetimeIndex(['2000-01-01', '2000-01-02', '2000-01-03', '2000-01-04',
               '2000-01-05', '2000-01-06', '2000-01-07', '2000-01-08',
               '2000-01-09', '2000-01-10',
               ...
               '2000-12-21', '2000-12-22', '2000-12-23', '2000-12-24',
               '2000-12-25', '2000-12-26', '2000-12-27', '2000-12-28',
               '2000-12-29', '2000-12-30'],
              dtype='datetime64[ns]', length=365, freq='D')

Alternatively, we can specify the start time, the end time, and the sampling frequency.

In [5]:
ts = pd.date_range(start="2000-01-01", end="2000-12-30", freq="1D")
ts

DatetimeIndex(['2000-01-01', '2000-01-02', '2000-01-03', '2000-01-04',
               '2000-01-05', '2000-01-06', '2000-01-07', '2000-01-08',
               '2000-01-09', '2000-01-10',
               ...
               '2000-12-21', '2000-12-22', '2000-12-23', '2000-12-24',
               '2000-12-25', '2000-12-26', '2000-12-27', '2000-12-28',
               '2000-12-29', '2000-12-30'],
              dtype='datetime64[ns]', length=365, freq='D')

To convert datetimes into formatted strings, we can use the `strftime` method. For example, here we format each date as 'Jan 01 00':

In [6]:
ts.strftime("%b %d %y")

Index(['Jan 01 00', 'Jan 02 00', 'Jan 03 00', 'Jan 04 00', 'Jan 05 00',
       'Jan 06 00', 'Jan 07 00', 'Jan 08 00', 'Jan 09 00', 'Jan 10 00',
       ...
       'Dec 21 00', 'Dec 22 00', 'Dec 23 00', 'Dec 24 00', 'Dec 25 00',
       'Dec 26 00', 'Dec 27 00', 'Dec 28 00', 'Dec 29 00', 'Dec 30 00'],
      dtype='object', length=365)

Note that this string index is no longer a datetime object. The **formatter** `%b` formats the month as an abbreviated name, `%d` is the day of the month as a zero-padded decimal number, and `%y` is the year without century as a zero-padded decimal number. The full list of formatters can be found in [Datetime: `strftime`-`strptime` Behavior](https://docs.python.org/3/library/datetime.html#strftime-strptime-behavior).

Similarly, we can format the time coordinate of a DataArray into a string format:

In [7]:
olr_da.time.dt.strftime("%b %d %y")

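In other words, the **DatetimeAccessor** `xarray.DataArray.time.dt` offers many of the same datetime attributes and methods (such as `month`, `day`, and `strftime`) as a `pandas.DatetimeIndex`, but it returns xarray objects rather than pandas ones.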
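The relationship between the two can be sketched on a synthetic five-day series. The names `demo_da` and `time_index` are illustrative and not part of the original notebook. The accessor returns a DataArray, while the index stored behind the coordinate is a `pandas.DatetimeIndex`, and both report the same month values.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Small synthetic series (hypothetical data).
time = pd.date_range("2000-01-30", periods=5, freq="D")
demo_da = xr.DataArray(np.zeros(time.size), coords={"time": time}, dims="time")

# The .dt accessor returns an xarray DataArray ...
months_xr = demo_da.time.dt.month

# ... while the index behind the coordinate is a pandas DatetimeIndex.
time_index = demo_da.indexes["time"]
months_pd = time_index.month

print(type(months_xr).__name__)
print(type(time_index).__name__)
print(np.array_equal(months_xr.values, months_pd.values))
```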

The `strftime` method is worth learning because it is used to format the time labels on time series plots and Hovmöller diagrams.
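Because `strftime` returns plain strings, it is usually safer to keep the original datetime index for selection and arithmetic and to use the formatted strings only as labels. The following sketch shows that the strings can be parsed back with the same format when needed. The variable names `labels` and `parsed` are illustrative, and `%b` assumes month abbreviations in the current locale.

```python
import pandas as pd

ts = pd.date_range("2000-01-01", periods=3)

# Formatting yields an Index of strings, not a DatetimeIndex.
labels = ts.strftime("%b %d %y")

# Parsing with the same format string recovers the datetimes.
parsed = pd.to_datetime(labels, format="%b %d %y")

print(list(labels))
print(isinstance(labels, pd.DatetimeIndex))
print(bool((parsed == ts).all()))
```

Note that a two-digit year such as `%y` is ambiguous in general; the round trip works here only because the parser maps `00` back to the year 2000.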

## `datetime` and `timedelta`

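Both the DatetimeAccessor and `pandas.DatetimeIndex` work with datetime values. These are closely related to the `datetime.datetime` type in Python's standard library: each element of a `pandas.DatetimeIndex` is a `pandas.Timestamp`, which is a subclass of `datetime.datetime`.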

> `datetime.datetime`: A combination of a date and a time. Attributes: year, month, day, hour, minute, second, microsecond, and tzinfo. [(`datetime` official website)](https://docs.python.org/3/library/datetime.html)

We can also perform arithmetic calculations on datetime objects. For example, we can use the combination of `datetime.datetime` and `datetime.timedelta` to obtain a certain date.

> A timedelta object represents a duration, the difference between two dates or times. [(`datetime` official website)](https://docs.python.org/3/library/datetime.html)

The following are some arithmetic rules for `datetime.datetime` and `datetime.timedelta`:

```
datetime2 = datetime1 + timedelta 
datetime2 = datetime1 - timedelta
timedelta = datetime1 - datetime2
datetime1 < datetime2 
```
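The rules above can be tried directly with the standard library. The specific dates below are arbitrary choices for illustration.

```python
from datetime import datetime, timedelta

start = datetime(2000, 2, 28)
one_day = timedelta(days=1)

next_day = start + one_day           # datetime + timedelta -> datetime
prev_day = start - one_day           # datetime - timedelta -> datetime
gap = datetime(2000, 3, 1) - start   # datetime - datetime -> timedelta

print(next_day)          # 2000-02-29 00:00:00 (2000 is a leap year)
print(prev_day)          # 2000-02-27 00:00:00
print(gap.days)          # 2
print(start < next_day)  # True: earlier datetimes compare as smaller
```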