OpenDAP via xpublish#

Let’s start by exploring how to convert an xarray dataset into an opendap_protocol dataset.

xarray -> opendap_protocol#

import numpy as np
import opendap_protocol as dap
import xarray as xr
ds = xr.open_dataset("../datasets/ww3_72_east_coast_2022041112.nc")
ds
<xarray.Dataset>
Dimensions:                  (longitude: 381, latitude: 351, time: 73,
                              forecast_reference_time: 1)
Coordinates:
  * longitude                (longitude) float32 -93.0 -92.9 ... -55.1 -55.0
  * latitude                 (latitude) float32 20.0 20.1 20.2 ... 54.9 55.0
  * time                     (time) datetime64[ns] 2022-04-11T12:00:00 ... 20...
  * forecast_reference_time  (forecast_reference_time) datetime64[ns] 2022-04...
Data variables: (12/21)
    hs                       (forecast_reference_time, time, latitude, longitude) float32 ...
    t02                      (forecast_reference_time, time, latitude, longitude) float32 ...
    t0m1                     (forecast_reference_time, time, latitude, longitude) float32 ...
    t01                      (forecast_reference_time, time, latitude, longitude) float32 ...
    fp                       (forecast_reference_time, time, latitude, longitude) float32 ...
    dir                      (forecast_reference_time, time, latitude, longitude) float32 ...
    ...                       ...
    pdir2                    (forecast_reference_time, time, latitude, longitude) float32 ...
    pws0                     (forecast_reference_time, time, latitude, longitude) float32 ...
    pws1                     (forecast_reference_time, time, latitude, longitude) float32 ...
    pws2                     (forecast_reference_time, time, latitude, longitude) float32 ...
    tws                      (forecast_reference_time, time, latitude, longitude) float32 ...
    pnr                      (forecast_reference_time, time, latitude, longitude) float32 ...
Attributes: (12/16)
    WAVEWATCH_III_version_number:     5.16
    WAVEWATCH_III_switches:           F90 SHRD NOGRB NOPA LRB4 NC4 TRKNC PR3 ...
    SIN4 namelist parameter BETAMAX:  1.65
    product_name:                     ww3.20220411.nc
    area:                             NW Atlantic  6 arc min grd2
    latitude_resolution:              0.1000000
    ...                               ...
    easternmost_longitude:            -55.00000
    minimum_altitude:                 -12000 m
    maximum_altitude:                 9000 m
    altitude_resolution:              n/a
    start_date:                       2022-04-11 12:00:00
    stop_date:                        2022-04-11 23:00:00

Dimensions first#

First, we need to convert our dimensions into opendap_protocol.Array objects.

longitude = ds['longitude']
longitude
<xarray.DataArray 'longitude' (longitude: 381)>
array([-93. , -92.9, -92.8, ..., -55.2, -55.1, -55. ], dtype=float32)
Coordinates:
  * longitude  (longitude) float32 -93.0 -92.9 -92.8 -92.7 ... -55.2 -55.1 -55.0
Attributes:
    units:          degree_east
    long_name:      longitude
    standard_name:  longitude
    valid_min:      -180.0
    valid_max:      360.0
    axis:           X
longitude.name
'longitude'
longitude.dtype
dtype('float32')
dtype_dap = {
    np.ubyte: dap.Byte,
    np.int16: dap.Int16,
    np.uint16: dap.UInt16,
    np.int32: dap.Int32,
    np.uint32: dap.UInt32,
    np.float32: dap.Float32,
    np.float64: dap.Float64,
    np.str_: dap.String
}
dtype_dap = {np.dtype(k): v for k, v in dtype_dap.items()}

def dtype_to_daptype(dtype: np.dtype) -> dap.DAPAtom:
    try:
        return dtype_dap[dtype]
    except KeyError:
        print(f"Coercing dtype to string for {dtype}. Probably not a good idea.")
        return dap.String
    
# dtype_dap[longitude.dtype]
dtype_to_daptype(longitude.dtype)
opendap_protocol.protocol.Float32

I probably need to dig into OpenDAP further to figure out the right coercions for datetimes.
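Until then, one plausible approach (a sketch, not what this notebook does) is CF-style numeric encoding: convert datetime64 values to float64 seconds since the epoch, which maps cleanly onto dap.Float64. A real implementation would also need to attach matching units and calendar attributes so clients can decode the values.

```python
import numpy as np

def encode_datetimes(values: np.ndarray) -> np.ndarray:
    """Encode datetime64[ns] values as float64 'seconds since 1970-01-01'."""
    epoch = np.datetime64("1970-01-01T00:00:00", "ns")
    # Subtracting yields timedelta64[ns]; dividing by one second gives float64
    return (values.astype("datetime64[ns]") - epoch) / np.timedelta64(1, "s")

times = np.array(["2022-04-11T12:00:00", "2022-04-11T13:00:00"], dtype="datetime64[ns]")
encode_datetimes(times)
```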

dtype_to_daptype(ds['time'].dtype)
Coercing dtype to string for datetime64[ns]. Probably not a good idea.
opendap_protocol.protocol.String
dtype_dap
{dtype('uint8'): opendap_protocol.protocol.Byte,
 dtype('int16'): opendap_protocol.protocol.Int16,
 dtype('uint16'): opendap_protocol.protocol.UInt16,
 dtype('int32'): opendap_protocol.protocol.Int32,
 dtype('uint32'): opendap_protocol.protocol.UInt32,
 dtype('float32'): opendap_protocol.protocol.Float32,
 dtype('float64'): opendap_protocol.protocol.Float64,
 dtype('<U'): opendap_protocol.protocol.String}
def dap_dimension(da: xr.DataArray) -> dap.Array:
    return dap.Array(
        name=da.name,
        data=da.values,
        dtype=dtype_to_daptype(da.dtype)
    )

dap_dimension(longitude)
<opendap_protocol.protocol.Array at 0x18e457550>

Let’s make a dictionary of our dimensions so that we can find them later.

dims = {}
for dim in ds.dims:
    try:
        dims[dim] = dap_dimension(ds[dim])
    except KeyError as e:
        raise KeyError(f"Problem with dim {dim}: {ds[dim]}") from e
dims
Coercing dtype to string for datetime64[ns]. Probably not a good idea.
Coercing dtype to string for datetime64[ns]. Probably not a good idea.
{'longitude': <opendap_protocol.protocol.Array at 0x18e4623b0>,
 'latitude': <opendap_protocol.protocol.Array at 0x18e462350>,
 'time': <opendap_protocol.protocol.Array at 0x18d77f610>,
 'forecast_reference_time': <opendap_protocol.protocol.Array at 0x18d77e020>}

Next, the variables#

hs = ds['hs']
hs
<xarray.DataArray 'hs' (forecast_reference_time: 1, time: 73, latitude: 351,
                        longitude: 381)>
[9762363 values with dtype=float32]
Coordinates:
  * longitude                (longitude) float32 -93.0 -92.9 ... -55.1 -55.0
  * latitude                 (latitude) float32 20.0 20.1 20.2 ... 54.9 55.0
  * time                     (time) datetime64[ns] 2022-04-11T12:00:00 ... 20...
  * forecast_reference_time  (forecast_reference_time) datetime64[ns] 2022-04...
Attributes:
    long_name:      significant height of wind and swell waves
    standard_name:  sea_surface_wave_significant_height
    globwave_name:  significant_wave_height
    units:          m
    valid_min:      0
    valid_max:      32000
hs.shape
(1, 73, 351, 381)
hs.dtype
dtype('float32')
hs.dims
('forecast_reference_time', 'time', 'latitude', 'longitude')
hs_dap = dap.Grid(
    name=hs.name,
    data=hs.data,
    dtype=dtype_to_daptype(hs.dtype),
    dimensions=[
        dims[dim] for dim in hs.dims
    ]
)
hs_dap
<opendap_protocol.protocol.Grid at 0x18f5a51e0>
hs_attrs = [dap.Attribute(name=k, value=v, dtype=dap.String) for k, v in hs.attrs.items()]
hs_attrs
[<opendap_protocol.protocol.Attribute at 0x18f5c8d00>,
 <opendap_protocol.protocol.Attribute at 0x18ea91450>,
 <opendap_protocol.protocol.Attribute at 0x18ea935e0>,
 <opendap_protocol.protocol.Attribute at 0x18ea91540>,
 <opendap_protocol.protocol.Attribute at 0x18ea935b0>,
 <opendap_protocol.protocol.Attribute at 0x18ea91840>]
hs_dap.append(*hs_attrs)
hs_dap
<opendap_protocol.protocol.Grid at 0x18f5a51e0>

Now for a dataset#

dataset = dap.Dataset(name="ww3")
dataset.append(*dims.values(), hs_dap)
dataset
<opendap_protocol.protocol.Dataset at 0x18a2ad870>
print(''.join(dataset.dds()))
Dataset {
    Float32 longitude[longitude = 381];
    Float32 latitude[latitude = 351];
    String time[time = 73];
    String forecast_reference_time[forecast_reference_time = 1];
    Grid {
      Array:
        Float32 hs[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
      Maps:
        String forecast_reference_time[forecast_reference_time = 1];
        String time[time = 73];
        Float32 latitude[latitude = 351];
        Float32 longitude[longitude = 381];
    } hs;
} ww3;

How about a more generalized dtype conversion function? Then in the future we can inspect the dataset further to figure out the right conversion.

def dap_dtype(da: xr.DataArray):
    try:
        return dtype_dap[da.dtype]
    except KeyError:
        print(f"Unable to match dtype for {da.name}. Going to assume string will work for now...")
        return dap.String

dap_dtype(hs)
opendap_protocol.protocol.Float32

And let’s bundle those up more neatly into functions#

def dap_dimension(da: xr.DataArray) -> dap.Array:
    dim = dap.Array(
        name=da.name,
        data=da.values,
        dtype=dap_dtype(da)
    )
    
    for k, v in da.attrs.items():
        dim.append(dap.Attribute(name=k, value=v, dtype=dap.String))
    
    return dim

dap_dimension(ds['longitude'])
dap_dimension(ds['time'])
Unable to match dtype for time. Going to assume string will work for now...
<opendap_protocol.protocol.Array at 0x1923d9db0>
def dap_array(da: xr.DataArray, dims: dict[str, dap.Array]) -> dap.Grid:
    data_array = dap.Grid(
        name=da.name,
        data=da.data,
        dtype=dap_dtype(da),
        dimensions=[
            dims[dim] for dim in da.dims
        ]
    )
    
    for k, v in da.attrs.items():
        data_array.append(dap.Attribute(name=k, value=v, dtype=dap.String))
    
    return data_array

dap_array(ds['hs'], dims)
<opendap_protocol.protocol.Grid at 0x192370b20>
def dap_dataset(ds: xr.Dataset, name: str) -> dap.Dataset:
    dataset = dap.Dataset(name=name)
    
    dims = {}
    for dim in ds.dims:
        dims[dim] = dap_dimension(ds[dim])
            
    dataset.append(*dims.values())
    
    for var in ds.variables:
        if var not in ds.dims:
            data_array = dap_array(ds[var], dims)
            dataset.append(data_array)
    
    for k, v in ds.attrs.items():
        dataset.append(dap.Attribute(name=k, value=v, dtype=dap.String))
    
    return dataset

ww3_dap = dap_dataset(ds, "ww3")
ww3_dap
Unable to match dtype for time. Going to assume string will work for now...
Unable to match dtype for forecast_reference_time. Going to assume string will work for now...
<opendap_protocol.protocol.Dataset at 0x1924c5a80>
print("".join(ww3_dap.dds()))
Dataset {
    Float32 longitude[longitude = 381];
    Float32 latitude[latitude = 351];
    String time[time = 73];
    String forecast_reference_time[forecast_reference_time = 1];
    Grid {
      Array:
        Float32 hs[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
      Maps:
        String forecast_reference_time[forecast_reference_time = 1];
        String time[time = 73];
        Float32 latitude[latitude = 351];
        Float32 longitude[longitude = 381];
    } hs;
    Grid {
      Array:
        Float32 t02[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
      Maps:
        String forecast_reference_time[forecast_reference_time = 1];
        String time[time = 73];
        Float32 latitude[latitude = 351];
        Float32 longitude[longitude = 381];
    } t02;
    Grid {
      Array:
        Float32 t0m1[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
      Maps:
        String forecast_reference_time[forecast_reference_time = 1];
        String time[time = 73];
        Float32 latitude[latitude = 351];
        Float32 longitude[longitude = 381];
    } t0m1;
    Grid {
      Array:
        Float32 t01[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
      Maps:
        String forecast_reference_time[forecast_reference_time = 1];
        String time[time = 73];
        Float32 latitude[latitude = 351];
        Float32 longitude[longitude = 381];
    } t01;
    Grid {
      Array:
        Float32 fp[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
      Maps:
        String forecast_reference_time[forecast_reference_time = 1];
        String time[time = 73];
        Float32 latitude[latitude = 351];
        Float32 longitude[longitude = 381];
    } fp;
    Grid {
      Array:
        Float32 dir[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
      Maps:
        String forecast_reference_time[forecast_reference_time = 1];
        String time[time = 73];
        Float32 latitude[latitude = 351];
        Float32 longitude[longitude = 381];
    } dir;
    Grid {
      Array:
        Float32 dp[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
      Maps:
        String forecast_reference_time[forecast_reference_time = 1];
        String time[time = 73];
        Float32 latitude[latitude = 351];
        Float32 longitude[longitude = 381];
    } dp;
    Grid {
      Array:
        Float32 phs0[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
      Maps:
        String forecast_reference_time[forecast_reference_time = 1];
        String time[time = 73];
        Float32 latitude[latitude = 351];
        Float32 longitude[longitude = 381];
    } phs0;
    Grid {
      Array:
        Float32 phs1[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
      Maps:
        String forecast_reference_time[forecast_reference_time = 1];
        String time[time = 73];
        Float32 latitude[latitude = 351];
        Float32 longitude[longitude = 381];
    } phs1;
    Grid {
      Array:
        Float32 phs2[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
      Maps:
        String forecast_reference_time[forecast_reference_time = 1];
        String time[time = 73];
        Float32 latitude[latitude = 351];
        Float32 longitude[longitude = 381];
    } phs2;
    Grid {
      Array:
        Float32 ptp0[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
      Maps:
        String forecast_reference_time[forecast_reference_time = 1];
        String time[time = 73];
        Float32 latitude[latitude = 351];
        Float32 longitude[longitude = 381];
    } ptp0;
    Grid {
      Array:
        Float32 ptp1[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
      Maps:
        String forecast_reference_time[forecast_reference_time = 1];
        String time[time = 73];
        Float32 latitude[latitude = 351];
        Float32 longitude[longitude = 381];
    } ptp1;
    Grid {
      Array:
        Float32 ptp2[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
      Maps:
        String forecast_reference_time[forecast_reference_time = 1];
        String time[time = 73];
        Float32 latitude[latitude = 351];
        Float32 longitude[longitude = 381];
    } ptp2;
    Grid {
      Array:
        Float32 pdir0[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
      Maps:
        String forecast_reference_time[forecast_reference_time = 1];
        String time[time = 73];
        Float32 latitude[latitude = 351];
        Float32 longitude[longitude = 381];
    } pdir0;
    Grid {
      Array:
        Float32 pdir1[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
      Maps:
        String forecast_reference_time[forecast_reference_time = 1];
        String time[time = 73];
        Float32 latitude[latitude = 351];
        Float32 longitude[longitude = 381];
    } pdir1;
    Grid {
      Array:
        Float32 pdir2[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
      Maps:
        String forecast_reference_time[forecast_reference_time = 1];
        String time[time = 73];
        Float32 latitude[latitude = 351];
        Float32 longitude[longitude = 381];
    } pdir2;
    Grid {
      Array:
        Float32 pws0[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
      Maps:
        String forecast_reference_time[forecast_reference_time = 1];
        String time[time = 73];
        Float32 latitude[latitude = 351];
        Float32 longitude[longitude = 381];
    } pws0;
    Grid {
      Array:
        Float32 pws1[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
      Maps:
        String forecast_reference_time[forecast_reference_time = 1];
        String time[time = 73];
        Float32 latitude[latitude = 351];
        Float32 longitude[longitude = 381];
    } pws1;
    Grid {
      Array:
        Float32 pws2[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
      Maps:
        String forecast_reference_time[forecast_reference_time = 1];
        String time[time = 73];
        Float32 latitude[latitude = 351];
        Float32 longitude[longitude = 381];
    } pws2;
    Grid {
      Array:
        Float32 tws[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
      Maps:
        String forecast_reference_time[forecast_reference_time = 1];
        String time[time = 73];
        Float32 latitude[latitude = 351];
        Float32 longitude[longitude = 381];
    } tws;
    Grid {
      Array:
        Float32 pnr[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
      Maps:
        String forecast_reference_time[forecast_reference_time = 1];
        String time[time = 73];
        Float32 latitude[latitude = 351];
        Float32 longitude[longitude = 381];
    } pnr;
} ww3;
print("".join(ww3_dap.das()))
Attributes {
    longitude {
        String units "degree_east";
        String long_name "longitude";
        String standard_name "longitude";
        String valid_min "-180.0";
        String valid_max "360.0";
        String axis "X";
    }
    latitude {
        String units "degree_north";
        String long_name "latitude";
        String standard_name "latitude";
        String valid_min "-90.0";
        String valid_max "180.0";
        String axis "Y";
    }
    time {
        String long_name "julian day (UT)";
        String standard_name "time";
        String conventions "relative julian days with decimal part (as parts of the day )";
        String axis "T";
    }
    forecast_reference_time {
        String long_name "Forecast Reference Time";
        String standard_name "forecast_reference_time";
        String conventions "relative julian days with decimal part (as parts of the day )";
    }
    hs {
        String long_name "significant height of wind and swell waves";
        String standard_name "sea_surface_wave_significant_height";
        String globwave_name "significant_wave_height";
        String units "m";
        String valid_min "0";
        String valid_max "32000";
    }
    t02 {
        String long_name "mean period T02";
        String standard_name "sea_surface_wind_wave_mean_period_from_variance_spectral_density_second_frequency_moment";
        String globwave_name "mean_period_t02";
        String units "s";
        String valid_min "0";
        String valid_max "5000";
    }
    t0m1 {
        String long_name "mean period T0m1";
        String standard_name "sea_surface_wind_wave_mean_period_from_variance_spectral_density_inverse_frequency_moment";
        String globwave_name "mean_period_t0m1";
        String units "s";
        String valid_min "0";
        String valid_max "5000";
    }
    t01 {
        String long_name "mean period T01";
        String standard_name "sea_surface_wind_wave_mean_period_from_variance_spectral_density_first_frequency_moment";
        String globwave_name "mean_period_t01";
        String units "s";
        String valid_min "0";
        String valid_max "5000";
    }
    fp {
        String long_name "wave peak frequency";
        String standard_name "sea_surface_wave_peak_frequency";
        String globwave_name "dominant_wave_frequency";
        String units "s-1";
        String valid_min "0";
        String valid_max "10000";
    }
    dir {
        String long_name "wave mean direction";
        String standard_name "sea_surface_wave_from_direction";
        String globwave_name "wave_from_direction";
        String units "degree";
        String valid_min "0";
        String valid_max "3600";
    }
    dp {
        String long_name "peak direction";
        String standard_name "sea_surface_wave_peak_direction";
        String globwave_name "dominant_wave_direction";
        String units "degree";
        String valid_min "0";
        String valid_max "360";
    }
    phs0 {
        String long_name "wave significant height partition 0";
        String standard_name "sea_surface_wave_significant_height_partition_0";
        String globwave_name "significant_wave_height_partition_0";
        String units "m";
        String valid_min "0";
        String valid_max "32000";
    }
    phs1 {
        String long_name "wave significant height partition 1";
        String standard_name "sea_surface_wave_significant_height_partition_1";
        String globwave_name "significant_wave_height_partition_1";
        String units "m";
        String valid_min "0";
        String valid_max "32000";
    }
    phs2 {
        String long_name "wave significant height partition 2";
        String standard_name "sea_surface_wave_significant_height_partition_2";
        String globwave_name "significant_wave_height_partition_2";
        String units "m";
        String valid_min "0";
        String valid_max "32000";
    }
    ptp0 {
        String long_name "peak period partition 0";
        String standard_name "sea_surface_wave_peak_period_partition_0";
        String globwave_name "dominant_wave_period_partition_0";
        String units "s";
        String valid_min "0";
        String valid_max "10000";
    }
    ptp1 {
        String long_name "peak period partition 1";
        String standard_name "sea_surface_wave_peak_period_partition_1";
        String globwave_name "dominant_wave_period_partition_1";
        String units "s";
        String valid_min "0";
        String valid_max "10000";
    }
    ptp2 {
        String long_name "peak period partition 2";
        String standard_name "sea_surface_wave_peak_period_partition_2";
        String globwave_name "dominant_wave_period_partition_2";
        String units "s";
        String valid_min "0";
        String valid_max "10000";
    }
    pdir0 {
        String long_name "wave mean direction partition 0";
        String standard_name "sea_surface_wave_from_direction_partition_0";
        String globwave_name "wave_from_direction_partition_0";
        String units "degree";
        String valid_min "0";
        String valid_max "360";
    }
    pdir1 {
        String long_name "wave mean direction partition 1";
        String standard_name "sea_surface_wave_from_direction_partition_1";
        String globwave_name "wave_from_direction_partition_1";
        String units "degree";
        String valid_min "0";
        String valid_max "360";
    }
    pdir2 {
        String long_name "wave mean direction partition 2";
        String standard_name "sea_surface_wave_from_direction_partition_2";
        String globwave_name "wave_from_direction_partition_2";
        String units "degree";
        String valid_min "0";
        String valid_max "360";
    }
    pws0 {
        String long_name "wind sea fraction in partition 0";
        String standard_name "wind_sea_fraction_in_partition_0";
        String globwave_name "wind_sea_fraction_in_partition_0";
        String units "1";
        String valid_min "0";
        String valid_max "1000";
    }
    pws1 {
        String long_name "wind sea fraction in partition 1";
        String standard_name "wind_sea_fraction_in_partition_1";
        String globwave_name "wind_sea_fraction_in_partition_1";
        String units "1";
        String valid_min "0";
        String valid_max "1000";
    }
    pws2 {
        String long_name "wind sea fraction in partition 2";
        String standard_name "wind_sea_fraction_in_partition_2";
        String globwave_name "wind_sea_fraction_in_partition_2";
        String units "1";
        String valid_min "0";
        String valid_max "1000";
    }
    tws {
        String long_name "wind sea fraction";
        String standard_name "wind_sea_fraction";
        String globwave_name "wind_sea_fraction";
        String units "1";
        String valid_min "0";
        String valid_max "1000";
    }
    pnr {
        String long_name "number of wave partitions";
        String standard_name "number_of_wave_partitions";
        String globwave_name "number_of_wave_partitions";
        String units "1";
        String valid_min "0";
        String valid_max "100";
    }
    String WAVEWATCH_III_version_number "5.16";
    String WAVEWATCH_III_switches "F90 SHRD NOGRB NOPA LRB4 NC4 TRKNC PR3 UQ FLX0 LN1 ST4 STAB0 NL1 BT1 DB1 MLIM IC0 IS0 REF0 WNT2 WNX2 RWND CRT1 CRX1 TR0 BS0 XX0 O0 O1 O2 O3 O4 O5 O6 O7";
    String SIN4_namelist_parameter_BETAMAX "1.649999976158142";
    String product_name "ww3.20220411.nc";
    String area "NW Atlantic  6 arc min grd2";
    String latitude_resolution "0.1000000";
    String longitude_resolution "0.1000000";
    String southernmost_latitude "20.00000";
    String northernmost_latitude "55.00000";
    String westernmost_longitude "-93.00000";
    String easternmost_longitude "-55.00000";
    String minimum_altitude "-12000 m";
    String maximum_altitude "9000 m";
    String altitude_resolution "n/a";
    String start_date "2022-04-11 12:00:00";
    String stop_date "2022-04-11 23:00:00";
}

This will break things, but probably for a very good reason (too much data in the output stream for Jupyter to handle)!

print(b"".join(dataset.dods()))

That seemed to work. Let’s move those into a router and give it a shot.

Loading with OpenDAP#

Now we can try loading the dataset via OpenDAP.

ds_from_dap = xr.open_dataset("http://0.0.0.0:9005/datasets/ww3/opendap")
ds_from_dap
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
File /usr/local/miniconda3/envs/code-sprint-2022/lib/python3.10/site-packages/xarray/backends/netCDF4_.py:106, in NetCDF4ArrayWrapper._getitem(self, key)
    105         original_array = self.get_array(needs_lock=False)
--> 106         array = getitem(original_array, key)
    107 except IndexError:
    108     # Catch IndexError in netCDF4 and return a more informative
    109     # error message.  This is most often called when an unsorted
    110     # indexer is used before the data is loaded from disk.

File /usr/local/miniconda3/envs/code-sprint-2022/lib/python3.10/site-packages/xarray/backends/common.py:64, in robust_getitem(array, key, catch, max_retries, initial_delay)
     63 try:
---> 64     return array[key]
     65 except catch:

File src/netCDF4/_netCDF4.pyx:4406, in netCDF4._netCDF4.Variable.__getitem__()

File src/netCDF4/_netCDF4.pyx:5348, in netCDF4._netCDF4.Variable._get()

IndexError: index exceeds dimension bounds

During handling of the above exception, another exception occurred:

IndexError                                Traceback (most recent call last)
Input In [78], in <cell line: 1>()
----> 1 ds_from_dap = xr.open_dataset("http://0.0.0.0:9005/datasets/ww3/opendap")
      2 ds_from_dap

File /usr/local/miniconda3/envs/code-sprint-2022/lib/python3.10/site-packages/xarray/backends/api.py:495, in open_dataset(filename_or_obj, engine, chunks, cache, decode_cf, mask_and_scale, decode_times, decode_timedelta, use_cftime, concat_characters, decode_coords, drop_variables, backend_kwargs, *args, **kwargs)
    483 decoders = _resolve_decoders_kwargs(
    484     decode_cf,
    485     open_backend_dataset_parameters=backend.open_dataset_parameters,
   (...)
    491     decode_coords=decode_coords,
    492 )
    494 overwrite_encoded_chunks = kwargs.pop("overwrite_encoded_chunks", None)
--> 495 backend_ds = backend.open_dataset(
    496     filename_or_obj,
    497     drop_variables=drop_variables,
    498     **decoders,
    499     **kwargs,
    500 )
    501 ds = _dataset_from_backend_dataset(
    502     backend_ds,
    503     filename_or_obj,
   (...)
    510     **kwargs,
    511 )
    512 return ds

File /usr/local/miniconda3/envs/code-sprint-2022/lib/python3.10/site-packages/xarray/backends/netCDF4_.py:567, in NetCDF4BackendEntrypoint.open_dataset(self, filename_or_obj, mask_and_scale, decode_times, concat_characters, decode_coords, drop_variables, use_cftime, decode_timedelta, group, mode, format, clobber, diskless, persist, lock, autoclose)
    565 store_entrypoint = StoreBackendEntrypoint()
    566 with close_on_error(store):
--> 567     ds = store_entrypoint.open_dataset(
    568         store,
    569         mask_and_scale=mask_and_scale,
    570         decode_times=decode_times,
    571         concat_characters=concat_characters,
    572         decode_coords=decode_coords,
    573         drop_variables=drop_variables,
    574         use_cftime=use_cftime,
    575         decode_timedelta=decode_timedelta,
    576     )
    577 return ds

File /usr/local/miniconda3/envs/code-sprint-2022/lib/python3.10/site-packages/xarray/backends/store.py:39, in StoreBackendEntrypoint.open_dataset(self, store, mask_and_scale, decode_times, concat_characters, decode_coords, drop_variables, use_cftime, decode_timedelta)
     25 encoding = store.get_encoding()
     27 vars, attrs, coord_names = conventions.decode_cf_variables(
     28     vars,
     29     attrs,
   (...)
     36     decode_timedelta=decode_timedelta,
     37 )
---> 39 ds = Dataset(vars, attrs=attrs)
     40 ds = ds.set_coords(coord_names.intersection(vars))
     41 ds.set_close(store.close)

File /usr/local/miniconda3/envs/code-sprint-2022/lib/python3.10/site-packages/xarray/core/dataset.py:750, in Dataset.__init__(self, data_vars, coords, attrs)
    747 if isinstance(coords, Dataset):
    748     coords = coords.variables
--> 750 variables, coord_names, dims, indexes, _ = merge_data_and_coords(
    751     data_vars, coords, compat="broadcast_equals"
    752 )
    754 self._attrs = dict(attrs) if attrs is not None else None
    755 self._close = None

File /usr/local/miniconda3/envs/code-sprint-2022/lib/python3.10/site-packages/xarray/core/merge.py:483, in merge_data_and_coords(data, coords, compat, join)
    481 explicit_coords = coords.keys()
    482 indexes = dict(_extract_indexes_from_coords(coords))
--> 483 return merge_core(
    484     objects, compat, join, explicit_coords=explicit_coords, indexes=indexes
    485 )

File /usr/local/miniconda3/envs/code-sprint-2022/lib/python3.10/site-packages/xarray/core/merge.py:632, in merge_core(objects, compat, join, combine_attrs, priority_arg, explicit_coords, indexes, fill_value)
    628 coerced = coerce_pandas_values(objects)
    629 aligned = deep_align(
    630     coerced, join=join, copy=False, indexes=indexes, fill_value=fill_value
    631 )
--> 632 collected = collect_variables_and_indexes(aligned)
    634 prioritized = _get_priority_vars_and_indexes(aligned, priority_arg, compat=compat)
    635 variables, out_indexes = merge_collected(
    636     collected, prioritized, compat=compat, combine_attrs=combine_attrs
    637 )

File /usr/local/miniconda3/envs/code-sprint-2022/lib/python3.10/site-packages/xarray/core/merge.py:291, in collect_variables_and_indexes(list_of_mappings)
    288     indexes.pop(name, None)
    289     append_all(coords, indexes)
--> 291 variable = as_variable(variable, name=name)
    293 if variable.dims == (name,):
    294     idx_variable = variable.to_index_variable()

File /usr/local/miniconda3/envs/code-sprint-2022/lib/python3.10/site-packages/xarray/core/variable.py:154, in as_variable(obj, name)
    148     if obj.ndim != 1:
    149         raise MissingDimensionsError(
    150             f"{name!r} has more than 1-dimension and the same name as one of its "
    151             f"dimensions {obj.dims!r}. xarray disallows such variables because they "
    152             "conflict with the coordinates used to label dimensions."
    153         )
--> 154     obj = obj.to_index_variable()
    156 return obj

File /usr/local/miniconda3/envs/code-sprint-2022/lib/python3.10/site-packages/xarray/core/variable.py:528, in Variable.to_index_variable(self)
    526 def to_index_variable(self):
    527     """Return this variable as an xarray.IndexVariable"""
--> 528     return IndexVariable(
    529         self.dims, self._data, self._attrs, encoding=self._encoding, fastpath=True
    530     )

File /usr/local/miniconda3/envs/code-sprint-2022/lib/python3.10/site-packages/xarray/core/variable.py:2672, in IndexVariable.__init__(self, dims, data, attrs, encoding, fastpath)
   2670 # Unlike in Variable, always eagerly load values into memory
   2671 if not isinstance(self._data, PandasIndexingAdapter):
-> 2672     self._data = PandasIndexingAdapter(self._data)

File /usr/local/miniconda3/envs/code-sprint-2022/lib/python3.10/site-packages/xarray/core/indexing.py:1272, in PandasIndexingAdapter.__init__(self, array, dtype)
   1271 def __init__(self, array: pd.Index, dtype: DTypeLike = None):
-> 1272     self.array = utils.safe_cast_to_index(array)
   1274     if dtype is None:
   1275         if isinstance(array, pd.PeriodIndex):

File /usr/local/miniconda3/envs/code-sprint-2022/lib/python3.10/site-packages/xarray/core/utils.py:115, in safe_cast_to_index(array)
    113     if hasattr(array, "dtype") and array.dtype.kind == "O":
    114         kwargs["dtype"] = object
--> 115     index = pd.Index(np.asarray(array), **kwargs)
    116 return _maybe_cast_to_cftimeindex(index)

File /usr/local/miniconda3/envs/code-sprint-2022/lib/python3.10/site-packages/xarray/core/indexing.py:423, in LazilyIndexedArray.__array__(self, dtype)
    421 def __array__(self, dtype=None):
    422     array = as_indexable(self.array)
--> 423     return np.asarray(array[self.key], dtype=None)

File /usr/local/miniconda3/envs/code-sprint-2022/lib/python3.10/site-packages/xarray/backends/netCDF4_.py:93, in NetCDF4ArrayWrapper.__getitem__(self, key)
     92 def __getitem__(self, key):
---> 93     return indexing.explicit_indexing_adapter(
     94         key, self.shape, indexing.IndexingSupport.OUTER, self._getitem
     95     )

File /usr/local/miniconda3/envs/code-sprint-2022/lib/python3.10/site-packages/xarray/core/indexing.py:712, in explicit_indexing_adapter(key, shape, indexing_support, raw_indexing_method)
    690 """Support explicit indexing by delegating to a raw indexing method.
    691 
    692 Outer and/or vectorized indexers are supported by indexing a second time
   (...)
    709 Indexing result, in the form of a duck numpy-array.
    710 """
    711 raw_key, numpy_indices = decompose_indexer(key, shape, indexing_support)
--> 712 result = raw_indexing_method(raw_key.tuple)
    713 if numpy_indices.tuple:
    714     # index the loaded np.ndarray
    715     result = NumpyIndexingAdapter(np.asarray(result))[numpy_indices]

File /usr/local/miniconda3/envs/code-sprint-2022/lib/python3.10/site-packages/xarray/backends/netCDF4_.py:116, in NetCDF4ArrayWrapper._getitem(self, key)
    107 except IndexError:
    108     # Catch IndexError in netCDF4 and return a more informative
    109     # error message.  This is most often called when an unsorted
    110     # indexer is used before the data is loaded from disk.
    111     msg = (
    112         "The indexing operation you are attempting to perform "
    113         "is not valid on netCDF4.Variable object. Try loading "
    114         "your data into memory first by calling .load()."
    115     )
--> 116     raise IndexError(msg)
    117 return array

IndexError: The indexing operation you are attempting to perform is not valid on netCDF4.Variable object. Try loading your data into memory first by calling .load().

It fails, but the traceback points at an index as the culprit. We probably don’t have the right datatypes on the datetime dimensions, so we at least have a place to start exploring.
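A quick way to start that exploration is to scan the coordinates for datetime dtypes (NumPy kind `"M"`), since those are the likely suspects. This is just a sketch — the dtypes below are stand-ins for what `ds.coords` holds in the real notebook (`[n for n, c in ds.coords.items() if c.dtype.kind == "M"]` would do the same thing against the live dataset):

```python
import numpy as np

# Stand-in dtypes for the coordinates of the WW3 dataset above
coord_dtypes = {
    "longitude": np.dtype("float32"),
    "latitude": np.dtype("float32"),
    "time": np.dtype("datetime64[ns]"),
    "forecast_reference_time": np.dtype("datetime64[ns]"),
}

# Kind "M" marks datetime64 dtypes -- the coordinates that need special handling
datetime_coords = [name for name, dt in coord_dtypes.items() if dt.kind == "M"]
print(datetime_coords)  # ['time', 'forecast_reference_time']
```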

forecast_time = ds["forecast_reference_time"]
forecast_time
<xarray.DataArray 'forecast_reference_time' (forecast_reference_time: 1)>
array(['2022-04-11T12:00:00.000000000'], dtype='datetime64[ns]')
Coordinates:
  * forecast_reference_time  (forecast_reference_time) datetime64[ns] 2022-04...
Attributes:
    long_name:      Forecast Reference Time
    standard_name:  forecast_reference_time
    conventions:    relative julian days with decimal part (as parts of the d...
forecast_time.dtype
dtype('<M8[ns]')
forecast_time.data
array(['2022-04-11T12:00:00.000000000'], dtype='datetime64[ns]')
forecast_time.data.dtype
dtype('<M8[ns]')
forecast_time.encoding
{'zlib': True,
 'shuffle': True,
 'complevel': 1,
 'fletcher32': False,
 'contiguous': False,
 'chunksizes': (512,),
 'source': '/Users/akerney/GMRI/restful-grids/datasets/ww3_72_east_coast_2022041112.nc',
 'original_shape': (1,),
 'dtype': dtype('float64'),
 '_FillValue': nan,
 'units': 'days since 1990-01-01',
 'calendar': 'proleptic_gregorian'}
forecast_time.encoding['dtype']
dtype('float64')
def dap_dtype(da: xr.DataArray):
    """Return the DAP type for an xr.DataArray, based on its encoded (on-disk) dtype."""
    try:
        return dtype_dap[da.encoding["dtype"]]
    except KeyError as e:
        # Fall back to String when the encoded dtype isn't in the lookup
        logger.warning(
            f"Unable to match dtype for {da.name}. Going to assume string will work for now... ({e})"
        )
        return dap.String
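The `dtype_dap` lookup and `logger` used above are defined earlier in the notebook and aren’t shown here. A minimal sketch of what such a mapping might look like — in the real code the values would be `opendap_protocol` classes like `dap.Float64` and `dap.Int32`; plain names stand in below:

```python
import numpy as np

# Hypothetical sketch of the numpy dtype -> DAP type lookup used by dap_dtype()
dtype_dap = {
    np.dtype("float64"): "Float64",
    np.dtype("float32"): "Float32",
    np.dtype("int32"): "Int32",
    np.dtype("int16"): "Int16",
    np.dtype("uint8"): "Byte",
}

# The key insight from the exploration above: look up the *encoded* (on-disk)
# dtype, not the decoded in-memory one. forecast_reference_time decodes to
# datetime64[ns], which has no DAP equivalent, but its on-disk dtype is float64.
encoded = np.dtype("float64")         # forecast_time.encoding["dtype"]
decoded = np.dtype("datetime64[ns]")  # forecast_time.dtype
print(dtype_dap[encoded])             # Float64
print(decoded in dtype_dap)           # False
```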

Changed things to use the encoded dtype, and…

ds_from_dap = xr.open_dataset("http://0.0.0.0:9005/datasets/ww3/opendap")
ds_from_dap
<xarray.Dataset>
Dimensions:                  (longitude: 381, latitude: 351, time: 73,
                              forecast_reference_time: 1)
Coordinates:
  * longitude                (longitude) float32 -93.0 -92.9 ... -55.1 -55.0
  * latitude                 (latitude) float32 20.0 20.1 20.2 ... 54.9 55.0
  * time                     (time) float64 1.65e+18 1.65e+18 ... 1.65e+18
  * forecast_reference_time  (forecast_reference_time) float64 1.65e+18
Data variables: (12/21)
    hs                       (forecast_reference_time, time, latitude, longitude) float32 ...
    t02                      (forecast_reference_time, time, latitude, longitude) float32 ...
    t0m1                     (forecast_reference_time, time, latitude, longitude) float32 ...
    t01                      (forecast_reference_time, time, latitude, longitude) float32 ...
    fp                       (forecast_reference_time, time, latitude, longitude) float32 ...
    dir                      (forecast_reference_time, time, latitude, longitude) float32 ...
    ...                       ...
    pdir2                    (forecast_reference_time, time, latitude, longitude) float32 ...
    pws0                     (forecast_reference_time, time, latitude, longitude) float32 ...
    pws1                     (forecast_reference_time, time, latitude, longitude) float32 ...
    pws2                     (forecast_reference_time, time, latitude, longitude) float32 ...
    tws                      (forecast_reference_time, time, latitude, longitude) float32 ...
    pnr                      (forecast_reference_time, time, latitude, longitude) float32 ...
Attributes: (12/16)
    WAVEWATCH_III_version_number:     5.16
    WAVEWATCH_III_switches:           F90 SHRD NOGRB NOPA LRB4 NC4 TRKNC PR3 ...
    SIN4_namelist_parameter_BETAMAX:  1.649999976158142
    product_name:                     ww3.20220411.nc
    area:                             NW Atlantic  6 arc min grd2
    latitude_resolution:              0.1000000
    ...                               ...
    easternmost_longitude:            -55.00000
    minimum_altitude:                 -12000 m
    maximum_altitude:                 9000 m
    altitude_resolution:              n/a
    start_date:                       2022-04-11 12:00:00
    stop_date:                        2022-04-11 23:00:00

Holy crap, that worked.