OpenDAP via xpublish#
Let’s start by exploring how to convert an xarray dataset into an opendap_protocol dataset.
xarray -> opendap_protocol#
import numpy as np
import opendap_protocol as dap
import xarray as xr
ds = xr.open_dataset("../datasets/ww3_72_east_coast_2022041112.nc")
ds
<xarray.Dataset> Dimensions: (longitude: 381, latitude: 351, time: 73, forecast_reference_time: 1) Coordinates: * longitude (longitude) float32 -93.0 -92.9 ... -55.1 -55.0 * latitude (latitude) float32 20.0 20.1 20.2 ... 54.9 55.0 * time (time) datetime64[ns] 2022-04-11T12:00:00 ... 20... * forecast_reference_time (forecast_reference_time) datetime64[ns] 2022-04... Data variables: (12/21) hs (forecast_reference_time, time, latitude, longitude) float32 ... t02 (forecast_reference_time, time, latitude, longitude) float32 ... t0m1 (forecast_reference_time, time, latitude, longitude) float32 ... t01 (forecast_reference_time, time, latitude, longitude) float32 ... fp (forecast_reference_time, time, latitude, longitude) float32 ... dir (forecast_reference_time, time, latitude, longitude) float32 ... ... ... pdir2 (forecast_reference_time, time, latitude, longitude) float32 ... pws0 (forecast_reference_time, time, latitude, longitude) float32 ... pws1 (forecast_reference_time, time, latitude, longitude) float32 ... pws2 (forecast_reference_time, time, latitude, longitude) float32 ... tws (forecast_reference_time, time, latitude, longitude) float32 ... pnr (forecast_reference_time, time, latitude, longitude) float32 ... Attributes: (12/16) WAVEWATCH_III_version_number: 5.16 WAVEWATCH_III_switches: F90 SHRD NOGRB NOPA LRB4 NC4 TRKNC PR3 ... SIN4 namelist parameter BETAMAX: 1.65 product_name: ww3.20220411.nc area: NW Atlantic 6 arc min grd2 latitude_resolution: 0.1000000 ... ... easternmost_longitude: -55.00000 minimum_altitude: -12000 m maximum_altitude: 9000 m altitude_resolution: n/a start_date: 2022-04-11 12:00:00 stop_date: 2022-04-11 23:00:00
Dimensions first#
First we need to convert our dimensions into opendap_protocol.Arrays.
longitude = ds['longitude']
longitude
<xarray.DataArray 'longitude' (longitude: 381)> array([-93. , -92.9, -92.8, ..., -55.2, -55.1, -55. ], dtype=float32) Coordinates: * longitude (longitude) float32 -93.0 -92.9 -92.8 -92.7 ... -55.2 -55.1 -55.0 Attributes: units: degree_east long_name: longitude standard_name: longitude valid_min: -180.0 valid_max: 360.0 axis: X
longitude.name
'longitude'
longitude.dtype
dtype('float32')
dtype_dap = {
np.ubyte: dap.Byte,
np.int16: dap.Int16,
np.uint16: dap.UInt16,
np.int32: dap.Int32,
np.uint32: dap.UInt32,
np.float32: dap.Float32,
np.float64: dap.Float64,
np.str_: dap.String
}
dtype_dap = {np.dtype(k): v for k, v in dtype_dap.items()}
def dtype_to_daptype(dtype: np.dtype) -> dap.DAPAtom:
try:
return dtype_dap[dtype]
except KeyError:
print(f"Coercing dtype to string for {dtype}. Probably not a good idea.")
return dap.String
# dtype_dap[longitude.dtype]
dtype_to_daptype(longitude.dtype)
opendap_protocol.protocol.Float32
I probably need to dig into OpenDAP further to figure out the right coercions for datetimes.
dtype_to_daptype(ds['time'].dtype)
Coercing dtype to string for datetime64[ns]. Probably not a good idea.
opendap_protocol.protocol.String
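As a sketch of what a numeric coercion could look like, datetime64[ns] values can be viewed as integer nanoseconds since the Unix epoch and served as numbers, with a CF-style units attribute left for clients to decode. This is illustrative only, not necessarily how opendap_protocol wants it:

```python
import numpy as np

# Illustrative only: view datetime64[ns] as integer nanoseconds since
# 1970-01-01, which could then be served as a numeric DAP array. A client
# would still need a CF-style 'units' attribute to decode it.
t = np.array(["2022-04-11T12:00:00"], dtype="datetime64[ns]")
ns_since_epoch = t.astype("int64")
print(ns_since_epoch[0])  # → 1649678400000000000
```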
dtype_dap
{dtype('uint8'): opendap_protocol.protocol.Byte,
dtype('int16'): opendap_protocol.protocol.Int16,
dtype('uint16'): opendap_protocol.protocol.UInt16,
dtype('int32'): opendap_protocol.protocol.Int32,
dtype('uint32'): opendap_protocol.protocol.UInt32,
dtype('float32'): opendap_protocol.protocol.Float32,
dtype('float64'): opendap_protocol.protocol.Float64,
dtype('<U'): opendap_protocol.protocol.String}
def dap_dimension(da: xr.DataArray) -> dap.Array:
return dap.Array(
name=da.name,
data=da.values,
dtype=dtype_to_daptype(da.dtype)
)
dap_dimension(longitude)
<opendap_protocol.protocol.Array at 0x18e457550>
Let’s make a dictionary of our dimensions so that we can find them later.
dims = {}
for dim in ds.dims:
try:
dims[dim] = dap_dimension(ds[dim])
except KeyError as e:
raise KeyError(f"Problem with dim {dim}: {ds[dim]}") from e
dims
Coercing dtype to string for datetime64[ns]. Probably not a good idea.
Coercing dtype to string for datetime64[ns]. Probably not a good idea.
{'longitude': <opendap_protocol.protocol.Array at 0x18e4623b0>,
'latitude': <opendap_protocol.protocol.Array at 0x18e462350>,
'time': <opendap_protocol.protocol.Array at 0x18d77f610>,
'forecast_reference_time': <opendap_protocol.protocol.Array at 0x18d77e020>}
Next, the variables#
hs = ds['hs']
hs
<xarray.DataArray 'hs' (forecast_reference_time: 1, time: 73, latitude: 351, longitude: 381)> [9762363 values with dtype=float32] Coordinates: * longitude (longitude) float32 -93.0 -92.9 ... -55.1 -55.0 * latitude (latitude) float32 20.0 20.1 20.2 ... 54.9 55.0 * time (time) datetime64[ns] 2022-04-11T12:00:00 ... 20... * forecast_reference_time (forecast_reference_time) datetime64[ns] 2022-04... Attributes: long_name: significant height of wind and swell waves standard_name: sea_surface_wave_significant_height globwave_name: significant_wave_height units: m valid_min: 0 valid_max: 32000
hs.shape
(1, 73, 351, 381)
hs.dtype
dtype('float32')
hs.dims
('forecast_reference_time', 'time', 'latitude', 'longitude')
hs_dap = dap.Grid(
name=hs.name,
data=hs.data,
dtype=dtype_to_daptype(hs.dtype),
dimensions=[
dims[dim] for dim in hs.dims
]
)
hs_dap
<opendap_protocol.protocol.Grid at 0x18f5a51e0>
hs_attrs = [dap.Attribute(name=k, value=v, dtype=dap.String) for k, v in hs.attrs.items()]
hs_attrs
[<opendap_protocol.protocol.Attribute at 0x18f5c8d00>,
<opendap_protocol.protocol.Attribute at 0x18ea91450>,
<opendap_protocol.protocol.Attribute at 0x18ea935e0>,
<opendap_protocol.protocol.Attribute at 0x18ea91540>,
<opendap_protocol.protocol.Attribute at 0x18ea935b0>,
<opendap_protocol.protocol.Attribute at 0x18ea91840>]
hs_dap.append(*hs_attrs)
hs_dap
<opendap_protocol.protocol.Grid at 0x18f5a51e0>
Now for a dataset#
dataset = dap.Dataset(name="ww3")
dataset.append(*dims.values(), hs_dap)
dataset
<opendap_protocol.protocol.Dataset at 0x18a2ad870>
print(''.join(dataset.dds()))
Dataset {
Float32 longitude[longitude = 381];
Float32 latitude[latitude = 351];
String time[time = 73];
String forecast_reference_time[forecast_reference_time = 1];
Grid {
Array:
Float32 hs[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
Maps:
String forecast_reference_time[forecast_reference_time = 1];
String time[time = 73];
Float32 latitude[latitude = 351];
Float32 longitude[longitude = 381];
} hs;
} ww3;
How about a more generalized dtype conversion function? Then, in the future, we can inspect the dataset further to figure out the right conversion.
def dap_dtype(da: xr.DataArray):
try:
return dtype_dap[da.dtype]
except KeyError as e:
print(f"Unable to match dtype for {da.name}. Going to assume string will work for now...")
return dap.String
dap_dtype(hs)
opendap_protocol.protocol.Float32
And let’s bundle those up into tidier functions#
def dap_dimension(da: xr.DataArray) -> dap.Array:
dim = dap.Array(
name=da.name,
data=da.values,
dtype=dap_dtype(da)
)
for k, v in da.attrs.items():
dim.append(dap.Attribute(name=k, value=v, dtype=dap.String))
return dim
dap_dimension(ds['longitude'])
dap_dimension(ds['time'])
Unable to match dtype for time. Going to assume string will work for now...
<opendap_protocol.protocol.Array at 0x1923d9db0>
def dap_array(da: xr.DataArray, dims: dict[str, dap.Array]) -> dap.Grid:
data_array = dap.Grid(
name=da.name,
data=da.data,
dtype=dap_dtype(da),
dimensions=[
dims[dim] for dim in da.dims
]
)
for k, v in da.attrs.items():
data_array.append(dap.Attribute(name=k, value=v, dtype=dap.String))
return data_array
dap_array(ds['hs'], dims)
<opendap_protocol.protocol.Grid at 0x192370b20>
def dap_dataset(ds: xr.Dataset, name: str) -> dap.Dataset:
dataset = dap.Dataset(name=name)
dims = {}
for dim in ds.dims:
dims[dim] = dap_dimension(ds[dim])
dataset.append(*dims.values())
for var in ds.variables:
if var not in ds.dims:
data_array = dap_array(ds[var], dims)
dataset.append(data_array)
for k, v in ds.attrs.items():
dataset.append(dap.Attribute(name=k, value=v, dtype=dap.String))
return dataset
ww3_dap = dap_dataset(ds, "ww3")
ww3_dap
Unable to match dtype for time. Going to assume string will work for now...
Unable to match dtype for forecast_reference_time. Going to assume string will work for now...
<opendap_protocol.protocol.Dataset at 0x1924c5a80>
print("".join(ww3_dap.dds()))
Dataset {
Float32 longitude[longitude = 381];
Float32 latitude[latitude = 351];
String time[time = 73];
String forecast_reference_time[forecast_reference_time = 1];
Grid {
Array:
Float32 hs[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
Maps:
String forecast_reference_time[forecast_reference_time = 1];
String time[time = 73];
Float32 latitude[latitude = 351];
Float32 longitude[longitude = 381];
} hs;
Grid {
Array:
Float32 t02[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
Maps:
String forecast_reference_time[forecast_reference_time = 1];
String time[time = 73];
Float32 latitude[latitude = 351];
Float32 longitude[longitude = 381];
} t02;
Grid {
Array:
Float32 t0m1[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
Maps:
String forecast_reference_time[forecast_reference_time = 1];
String time[time = 73];
Float32 latitude[latitude = 351];
Float32 longitude[longitude = 381];
} t0m1;
Grid {
Array:
Float32 t01[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
Maps:
String forecast_reference_time[forecast_reference_time = 1];
String time[time = 73];
Float32 latitude[latitude = 351];
Float32 longitude[longitude = 381];
} t01;
Grid {
Array:
Float32 fp[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
Maps:
String forecast_reference_time[forecast_reference_time = 1];
String time[time = 73];
Float32 latitude[latitude = 351];
Float32 longitude[longitude = 381];
} fp;
Grid {
Array:
Float32 dir[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
Maps:
String forecast_reference_time[forecast_reference_time = 1];
String time[time = 73];
Float32 latitude[latitude = 351];
Float32 longitude[longitude = 381];
} dir;
Grid {
Array:
Float32 dp[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
Maps:
String forecast_reference_time[forecast_reference_time = 1];
String time[time = 73];
Float32 latitude[latitude = 351];
Float32 longitude[longitude = 381];
} dp;
Grid {
Array:
Float32 phs0[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
Maps:
String forecast_reference_time[forecast_reference_time = 1];
String time[time = 73];
Float32 latitude[latitude = 351];
Float32 longitude[longitude = 381];
} phs0;
Grid {
Array:
Float32 phs1[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
Maps:
String forecast_reference_time[forecast_reference_time = 1];
String time[time = 73];
Float32 latitude[latitude = 351];
Float32 longitude[longitude = 381];
} phs1;
Grid {
Array:
Float32 phs2[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
Maps:
String forecast_reference_time[forecast_reference_time = 1];
String time[time = 73];
Float32 latitude[latitude = 351];
Float32 longitude[longitude = 381];
} phs2;
Grid {
Array:
Float32 ptp0[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
Maps:
String forecast_reference_time[forecast_reference_time = 1];
String time[time = 73];
Float32 latitude[latitude = 351];
Float32 longitude[longitude = 381];
} ptp0;
Grid {
Array:
Float32 ptp1[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
Maps:
String forecast_reference_time[forecast_reference_time = 1];
String time[time = 73];
Float32 latitude[latitude = 351];
Float32 longitude[longitude = 381];
} ptp1;
Grid {
Array:
Float32 ptp2[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
Maps:
String forecast_reference_time[forecast_reference_time = 1];
String time[time = 73];
Float32 latitude[latitude = 351];
Float32 longitude[longitude = 381];
} ptp2;
Grid {
Array:
Float32 pdir0[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
Maps:
String forecast_reference_time[forecast_reference_time = 1];
String time[time = 73];
Float32 latitude[latitude = 351];
Float32 longitude[longitude = 381];
} pdir0;
Grid {
Array:
Float32 pdir1[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
Maps:
String forecast_reference_time[forecast_reference_time = 1];
String time[time = 73];
Float32 latitude[latitude = 351];
Float32 longitude[longitude = 381];
} pdir1;
Grid {
Array:
Float32 pdir2[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
Maps:
String forecast_reference_time[forecast_reference_time = 1];
String time[time = 73];
Float32 latitude[latitude = 351];
Float32 longitude[longitude = 381];
} pdir2;
Grid {
Array:
Float32 pws0[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
Maps:
String forecast_reference_time[forecast_reference_time = 1];
String time[time = 73];
Float32 latitude[latitude = 351];
Float32 longitude[longitude = 381];
} pws0;
Grid {
Array:
Float32 pws1[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
Maps:
String forecast_reference_time[forecast_reference_time = 1];
String time[time = 73];
Float32 latitude[latitude = 351];
Float32 longitude[longitude = 381];
} pws1;
Grid {
Array:
Float32 pws2[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
Maps:
String forecast_reference_time[forecast_reference_time = 1];
String time[time = 73];
Float32 latitude[latitude = 351];
Float32 longitude[longitude = 381];
} pws2;
Grid {
Array:
Float32 tws[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
Maps:
String forecast_reference_time[forecast_reference_time = 1];
String time[time = 73];
Float32 latitude[latitude = 351];
Float32 longitude[longitude = 381];
} tws;
Grid {
Array:
Float32 pnr[forecast_reference_time = 1][time = 73][latitude = 351][longitude = 381];
Maps:
String forecast_reference_time[forecast_reference_time = 1];
String time[time = 73];
Float32 latitude[latitude = 351];
Float32 longitude[longitude = 381];
} pnr;
} ww3;
print("".join(ww3_dap.das()))
Attributes {
longitude {
String units "degree_east";
String long_name "longitude";
String standard_name "longitude";
String valid_min "-180.0";
String valid_max "360.0";
String axis "X";
}
latitude {
String units "degree_north";
String long_name "latitude";
String standard_name "latitude";
String valid_min "-90.0";
String valid_max "180.0";
String axis "Y";
}
time {
String long_name "julian day (UT)";
String standard_name "time";
String conventions "relative julian days with decimal part (as parts of the day )";
String axis "T";
}
forecast_reference_time {
String long_name "Forecast Reference Time";
String standard_name "forecast_reference_time";
String conventions "relative julian days with decimal part (as parts of the day )";
}
hs {
String long_name "significant height of wind and swell waves";
String standard_name "sea_surface_wave_significant_height";
String globwave_name "significant_wave_height";
String units "m";
String valid_min "0";
String valid_max "32000";
}
t02 {
String long_name "mean period T02";
String standard_name "sea_surface_wind_wave_mean_period_from_variance_spectral_density_second_frequency_moment";
String globwave_name "mean_period_t02";
String units "s";
String valid_min "0";
String valid_max "5000";
}
t0m1 {
String long_name "mean period T0m1";
String standard_name "sea_surface_wind_wave_mean_period_from_variance_spectral_density_inverse_frequency_moment";
String globwave_name "mean_period_t0m1";
String units "s";
String valid_min "0";
String valid_max "5000";
}
t01 {
String long_name "mean period T01";
String standard_name "sea_surface_wind_wave_mean_period_from_variance_spectral_density_first_frequency_moment";
String globwave_name "mean_period_t01";
String units "s";
String valid_min "0";
String valid_max "5000";
}
fp {
String long_name "wave peak frequency";
String standard_name "sea_surface_wave_peak_frequency";
String globwave_name "dominant_wave_frequency";
String units "s-1";
String valid_min "0";
String valid_max "10000";
}
dir {
String long_name "wave mean direction";
String standard_name "sea_surface_wave_from_direction";
String globwave_name "wave_from_direction";
String units "degree";
String valid_min "0";
String valid_max "3600";
}
dp {
String long_name "peak direction";
String standard_name "sea_surface_wave_peak_direction";
String globwave_name "dominant_wave_direction";
String units "degree";
String valid_min "0";
String valid_max "360";
}
phs0 {
String long_name "wave significant height partition 0";
String standard_name "sea_surface_wave_significant_height_partition_0";
String globwave_name "significant_wave_height_partition_0";
String units "m";
String valid_min "0";
String valid_max "32000";
}
phs1 {
String long_name "wave significant height partition 1";
String standard_name "sea_surface_wave_significant_height_partition_1";
String globwave_name "significant_wave_height_partition_1";
String units "m";
String valid_min "0";
String valid_max "32000";
}
phs2 {
String long_name "wave significant height partition 2";
String standard_name "sea_surface_wave_significant_height_partition_2";
String globwave_name "significant_wave_height_partition_2";
String units "m";
String valid_min "0";
String valid_max "32000";
}
ptp0 {
String long_name "peak period partition 0";
String standard_name "sea_surface_wave_peak_period_partition_0";
String globwave_name "dominant_wave_period_partition_0";
String units "s";
String valid_min "0";
String valid_max "10000";
}
ptp1 {
String long_name "peak period partition 1";
String standard_name "sea_surface_wave_peak_period_partition_1";
String globwave_name "dominant_wave_period_partition_1";
String units "s";
String valid_min "0";
String valid_max "10000";
}
ptp2 {
String long_name "peak period partition 2";
String standard_name "sea_surface_wave_peak_period_partition_2";
String globwave_name "dominant_wave_period_partition_2";
String units "s";
String valid_min "0";
String valid_max "10000";
}
pdir0 {
String long_name "wave mean direction partition 0";
String standard_name "sea_surface_wave_from_direction_partition_0";
String globwave_name "wave_from_direction_partition_0";
String units "degree";
String valid_min "0";
String valid_max "360";
}
pdir1 {
String long_name "wave mean direction partition 1";
String standard_name "sea_surface_wave_from_direction_partition_1";
String globwave_name "wave_from_direction_partition_1";
String units "degree";
String valid_min "0";
String valid_max "360";
}
pdir2 {
String long_name "wave mean direction partition 2";
String standard_name "sea_surface_wave_from_direction_partition_2";
String globwave_name "wave_from_direction_partition_2";
String units "degree";
String valid_min "0";
String valid_max "360";
}
pws0 {
String long_name "wind sea fraction in partition 0";
String standard_name "wind_sea_fraction_in_partition_0";
String globwave_name "wind_sea_fraction_in_partition_0";
String units "1";
String valid_min "0";
String valid_max "1000";
}
pws1 {
String long_name "wind sea fraction in partition 1";
String standard_name "wind_sea_fraction_in_partition_1";
String globwave_name "wind_sea_fraction_in_partition_1";
String units "1";
String valid_min "0";
String valid_max "1000";
}
pws2 {
String long_name "wind sea fraction in partition 2";
String standard_name "wind_sea_fraction_in_partition_2";
String globwave_name "wind_sea_fraction_in_partition_2";
String units "1";
String valid_min "0";
String valid_max "1000";
}
tws {
String long_name "wind sea fraction";
String standard_name "wind_sea_fraction";
String globwave_name "wind_sea_fraction";
String units "1";
String valid_min "0";
String valid_max "1000";
}
pnr {
String long_name "number of wave partitions";
String standard_name "number_of_wave_partitions";
String globwave_name "number_of_wave_partitions";
String units "1";
String valid_min "0";
String valid_max "100";
}
String WAVEWATCH_III_version_number "5.16";
String WAVEWATCH_III_switches "F90 SHRD NOGRB NOPA LRB4 NC4 TRKNC PR3 UQ FLX0 LN1 ST4 STAB0 NL1 BT1 DB1 MLIM IC0 IS0 REF0 WNT2 WNX2 RWND CRT1 CRX1 TR0 BS0 XX0 O0 O1 O2 O3 O4 O5 O6 O7";
String SIN4_namelist_parameter_BETAMAX "1.649999976158142";
String product_name "ww3.20220411.nc";
String area "NW Atlantic 6 arc min grd2";
String latitude_resolution "0.1000000";
String longitude_resolution "0.1000000";
String southernmost_latitude "20.00000";
String northernmost_latitude "55.00000";
String westernmost_longitude "-93.00000";
String easternmost_longitude "-55.00000";
String minimum_altitude "-12000 m";
String maximum_altitude "9000 m";
String altitude_resolution "n/a";
String start_date "2022-04-11 12:00:00";
String stop_date "2022-04-11 23:00:00";
}
This will break things, but probably for very good reasons (too much data in the stream for Jupyter to handle)!
print(b"".join(dataset.dods()))
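If we did want to peek at the DODS response without flooding the notebook, one option is to consume only the first few chunks of the generator. Here `fake_dods` is a stand-in for `dataset.dods()`, which I'm assuming yields bytes chunks:

```python
import itertools

def fake_dods():
    # Stand-in for dataset.dods(); assumed to yield bytes chunks.
    for i in range(1000):
        yield b"chunk%d " % i

# Join only the first three chunks instead of the whole stream.
preview = b"".join(itertools.islice(fake_dods(), 3))
print(preview)  # → b'chunk0 chunk1 chunk2 '
```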
That seemed to work. Let’s move those into a router and give it a shot.
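Before wiring this into xpublish, here is a framework-free sketch of the three responses an OpenDAP route has to serve, keyed by the URL extension a client requests. `DapDatasetStub` and `handle_opendap` are hypothetical stand-ins for the dap.Dataset built above and the eventual route handlers, not xpublish's actual API:

```python
class DapDatasetStub:
    """Hypothetical stand-in for an opendap_protocol Dataset."""

    def dds(self):
        yield "Dataset {\n} ww3;\n"

    def das(self):
        yield "Attributes {\n}\n"

    def dods(self):
        yield b"binary payload"

def handle_opendap(dataset, extension: str):
    # OpenDAP clients select the response type via the URL extension:
    # .dds for structure, .das for attributes, .dods for binary data.
    if extension == "dds":
        return "".join(dataset.dds())
    if extension == "das":
        return "".join(dataset.das())
    if extension == "dods":
        return b"".join(dataset.dods())
    raise ValueError(f"Unknown OpenDAP extension: {extension}")

print(handle_opendap(DapDatasetStub(), "dds"))
```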
Loading with OpenDAP#
Now we can try loading the dataset via OpenDAP.
ds_from_dap = xr.open_dataset("http://0.0.0.0:9005/datasets/ww3/opendap")
ds_from_dap
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
File /usr/local/miniconda3/envs/code-sprint-2022/lib/python3.10/site-packages/xarray/backends/netCDF4_.py:106, in NetCDF4ArrayWrapper._getitem(self, key)
105 original_array = self.get_array(needs_lock=False)
--> 106 array = getitem(original_array, key)
107 except IndexError:
108 # Catch IndexError in netCDF4 and return a more informative
109 # error message. This is most often called when an unsorted
110 # indexer is used before the data is loaded from disk.
File /usr/local/miniconda3/envs/code-sprint-2022/lib/python3.10/site-packages/xarray/backends/common.py:64, in robust_getitem(array, key, catch, max_retries, initial_delay)
63 try:
---> 64 return array[key]
65 except catch:
File src/netCDF4/_netCDF4.pyx:4406, in netCDF4._netCDF4.Variable.__getitem__()
File src/netCDF4/_netCDF4.pyx:5348, in netCDF4._netCDF4.Variable._get()
IndexError: index exceeds dimension bounds
During handling of the above exception, another exception occurred:
IndexError Traceback (most recent call last)
Input In [78], in <cell line: 1>()
----> 1 ds_from_dap = xr.open_dataset("http://0.0.0.0:9005/datasets/ww3/opendap")
2 ds_from_dap
File /usr/local/miniconda3/envs/code-sprint-2022/lib/python3.10/site-packages/xarray/backends/api.py:495, in open_dataset(filename_or_obj, engine, chunks, cache, decode_cf, mask_and_scale, decode_times, decode_timedelta, use_cftime, concat_characters, decode_coords, drop_variables, backend_kwargs, *args, **kwargs)
483 decoders = _resolve_decoders_kwargs(
484 decode_cf,
485 open_backend_dataset_parameters=backend.open_dataset_parameters,
(...)
491 decode_coords=decode_coords,
492 )
494 overwrite_encoded_chunks = kwargs.pop("overwrite_encoded_chunks", None)
--> 495 backend_ds = backend.open_dataset(
496 filename_or_obj,
497 drop_variables=drop_variables,
498 **decoders,
499 **kwargs,
500 )
501 ds = _dataset_from_backend_dataset(
502 backend_ds,
503 filename_or_obj,
(...)
510 **kwargs,
511 )
512 return ds
File /usr/local/miniconda3/envs/code-sprint-2022/lib/python3.10/site-packages/xarray/backends/netCDF4_.py:567, in NetCDF4BackendEntrypoint.open_dataset(self, filename_or_obj, mask_and_scale, decode_times, concat_characters, decode_coords, drop_variables, use_cftime, decode_timedelta, group, mode, format, clobber, diskless, persist, lock, autoclose)
565 store_entrypoint = StoreBackendEntrypoint()
566 with close_on_error(store):
--> 567 ds = store_entrypoint.open_dataset(
568 store,
569 mask_and_scale=mask_and_scale,
570 decode_times=decode_times,
571 concat_characters=concat_characters,
572 decode_coords=decode_coords,
573 drop_variables=drop_variables,
574 use_cftime=use_cftime,
575 decode_timedelta=decode_timedelta,
576 )
577 return ds
File /usr/local/miniconda3/envs/code-sprint-2022/lib/python3.10/site-packages/xarray/backends/store.py:39, in StoreBackendEntrypoint.open_dataset(self, store, mask_and_scale, decode_times, concat_characters, decode_coords, drop_variables, use_cftime, decode_timedelta)
25 encoding = store.get_encoding()
27 vars, attrs, coord_names = conventions.decode_cf_variables(
28 vars,
29 attrs,
(...)
36 decode_timedelta=decode_timedelta,
37 )
---> 39 ds = Dataset(vars, attrs=attrs)
40 ds = ds.set_coords(coord_names.intersection(vars))
41 ds.set_close(store.close)
File /usr/local/miniconda3/envs/code-sprint-2022/lib/python3.10/site-packages/xarray/core/dataset.py:750, in Dataset.__init__(self, data_vars, coords, attrs)
747 if isinstance(coords, Dataset):
748 coords = coords.variables
--> 750 variables, coord_names, dims, indexes, _ = merge_data_and_coords(
751 data_vars, coords, compat="broadcast_equals"
752 )
754 self._attrs = dict(attrs) if attrs is not None else None
755 self._close = None
File /usr/local/miniconda3/envs/code-sprint-2022/lib/python3.10/site-packages/xarray/core/merge.py:483, in merge_data_and_coords(data, coords, compat, join)
481 explicit_coords = coords.keys()
482 indexes = dict(_extract_indexes_from_coords(coords))
--> 483 return merge_core(
484 objects, compat, join, explicit_coords=explicit_coords, indexes=indexes
485 )
File /usr/local/miniconda3/envs/code-sprint-2022/lib/python3.10/site-packages/xarray/core/merge.py:632, in merge_core(objects, compat, join, combine_attrs, priority_arg, explicit_coords, indexes, fill_value)
628 coerced = coerce_pandas_values(objects)
629 aligned = deep_align(
630 coerced, join=join, copy=False, indexes=indexes, fill_value=fill_value
631 )
--> 632 collected = collect_variables_and_indexes(aligned)
634 prioritized = _get_priority_vars_and_indexes(aligned, priority_arg, compat=compat)
635 variables, out_indexes = merge_collected(
636 collected, prioritized, compat=compat, combine_attrs=combine_attrs
637 )
File /usr/local/miniconda3/envs/code-sprint-2022/lib/python3.10/site-packages/xarray/core/merge.py:291, in collect_variables_and_indexes(list_of_mappings)
288 indexes.pop(name, None)
289 append_all(coords, indexes)
--> 291 variable = as_variable(variable, name=name)
293 if variable.dims == (name,):
294 idx_variable = variable.to_index_variable()
File /usr/local/miniconda3/envs/code-sprint-2022/lib/python3.10/site-packages/xarray/core/variable.py:154, in as_variable(obj, name)
148 if obj.ndim != 1:
149 raise MissingDimensionsError(
150 f"{name!r} has more than 1-dimension and the same name as one of its "
151 f"dimensions {obj.dims!r}. xarray disallows such variables because they "
152 "conflict with the coordinates used to label dimensions."
153 )
--> 154 obj = obj.to_index_variable()
156 return obj
File /usr/local/miniconda3/envs/code-sprint-2022/lib/python3.10/site-packages/xarray/core/variable.py:528, in Variable.to_index_variable(self)
526 def to_index_variable(self):
527 """Return this variable as an xarray.IndexVariable"""
--> 528 return IndexVariable(
529 self.dims, self._data, self._attrs, encoding=self._encoding, fastpath=True
530 )
File /usr/local/miniconda3/envs/code-sprint-2022/lib/python3.10/site-packages/xarray/core/variable.py:2672, in IndexVariable.__init__(self, dims, data, attrs, encoding, fastpath)
2670 # Unlike in Variable, always eagerly load values into memory
2671 if not isinstance(self._data, PandasIndexingAdapter):
-> 2672 self._data = PandasIndexingAdapter(self._data)
File /usr/local/miniconda3/envs/code-sprint-2022/lib/python3.10/site-packages/xarray/core/indexing.py:1272, in PandasIndexingAdapter.__init__(self, array, dtype)
1271 def __init__(self, array: pd.Index, dtype: DTypeLike = None):
-> 1272 self.array = utils.safe_cast_to_index(array)
1274 if dtype is None:
1275 if isinstance(array, pd.PeriodIndex):
File /usr/local/miniconda3/envs/code-sprint-2022/lib/python3.10/site-packages/xarray/core/utils.py:115, in safe_cast_to_index(array)
113 if hasattr(array, "dtype") and array.dtype.kind == "O":
114 kwargs["dtype"] = object
--> 115 index = pd.Index(np.asarray(array), **kwargs)
116 return _maybe_cast_to_cftimeindex(index)
File /usr/local/miniconda3/envs/code-sprint-2022/lib/python3.10/site-packages/xarray/core/indexing.py:423, in LazilyIndexedArray.__array__(self, dtype)
421 def __array__(self, dtype=None):
422 array = as_indexable(self.array)
--> 423 return np.asarray(array[self.key], dtype=None)
File /usr/local/miniconda3/envs/code-sprint-2022/lib/python3.10/site-packages/xarray/backends/netCDF4_.py:93, in NetCDF4ArrayWrapper.__getitem__(self, key)
92 def __getitem__(self, key):
---> 93 return indexing.explicit_indexing_adapter(
94 key, self.shape, indexing.IndexingSupport.OUTER, self._getitem
95 )
File /usr/local/miniconda3/envs/code-sprint-2022/lib/python3.10/site-packages/xarray/core/indexing.py:712, in explicit_indexing_adapter(key, shape, indexing_support, raw_indexing_method)
690 """Support explicit indexing by delegating to a raw indexing method.
691
692 Outer and/or vectorized indexers are supported by indexing a second time
(...)
709 Indexing result, in the form of a duck numpy-array.
710 """
711 raw_key, numpy_indices = decompose_indexer(key, shape, indexing_support)
--> 712 result = raw_indexing_method(raw_key.tuple)
713 if numpy_indices.tuple:
714 # index the loaded np.ndarray
715 result = NumpyIndexingAdapter(np.asarray(result))[numpy_indices]
File /usr/local/miniconda3/envs/code-sprint-2022/lib/python3.10/site-packages/xarray/backends/netCDF4_.py:116, in NetCDF4ArrayWrapper._getitem(self, key)
107 except IndexError:
108 # Catch IndexError in netCDF4 and return a more informative
109 # error message. This is most often called when an unsorted
110 # indexer is used before the data is loaded from disk.
111 msg = (
112 "The indexing operation you are attempting to perform "
113 "is not valid on netCDF4.Variable object. Try loading "
114 "your data into memory first by calling .load()."
115 )
--> 116 raise IndexError(msg)
117 return array
IndexError: The indexing operation you are attempting to perform is not valid on netCDF4.Variable object. Try loading your data into memory first by calling .load().
It fails, but the traceback points to an index as the issue. We probably don’t have the right datatypes on the datetime dimensions, so we at least have a place to start exploring.
forecast_time = ds["forecast_reference_time"]
forecast_time
<xarray.DataArray 'forecast_reference_time' (forecast_reference_time: 1)> array(['2022-04-11T12:00:00.000000000'], dtype='datetime64[ns]') Coordinates: * forecast_reference_time (forecast_reference_time) datetime64[ns] 2022-04... Attributes: long_name: Forecast Reference Time standard_name: forecast_reference_time conventions: relative julian days with decimal part (as parts of the d...
forecast_time.dtype
dtype('<M8[ns]')
forecast_time.data
array(['2022-04-11T12:00:00.000000000'], dtype='datetime64[ns]')
forecast_time.data.dtype
dtype('<M8[ns]')
forecast_time.encoding
{'zlib': True,
'shuffle': True,
'complevel': 1,
'fletcher32': False,
'contiguous': False,
'chunksizes': (512,),
'source': '/Users/akerney/GMRI/restful-grids/datasets/ww3_72_east_coast_2022041112.nc',
'original_shape': (1,),
'dtype': dtype('float64'),
'_FillValue': nan,
'units': 'days since 1990-01-01',
'calendar': 'proleptic_gregorian'}
forecast_time.encoding['dtype']
dtype('float64')
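The encoding tells us the on-disk representation: float64 days since 1990-01-01. As a sketch, the decoded datetime64 values can be converted back to that numeric form with plain NumPy (this assumes datetime64 arithmetic lines up with the proleptic Gregorian calendar, which it does for these dates):

```python
import numpy as np

# Convert datetime64[ns] back to the encoded form:
# float64 'days since 1990-01-01'.
times = np.array(["2022-04-11T12:00:00"], dtype="datetime64[ns]")
ref = np.datetime64("1990-01-01", "ns")
days = (times - ref) / np.timedelta64(1, "D")
print(days[0])  # → 11788.5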
import logging

logger = logging.getLogger(__name__)

def dap_dtype(da: xr.DataArray):
    """Return a DAP type for the xr.DataArray, based on its encoded dtype"""
    try:
        return dtype_dap[da.encoding["dtype"]]
    except KeyError as e:
        logger.warning(
            f"Unable to match dtype for {da.name}. Going to assume string will work for now... ({e})"
        )
        return dap.String
Changed things to use the encoded dtype, and…
ds_from_dap = xr.open_dataset("http://0.0.0.0:9005/datasets/ww3/opendap")
ds_from_dap
<xarray.Dataset> Dimensions: (longitude: 381, latitude: 351, time: 73, forecast_reference_time: 1) Coordinates: * longitude (longitude) float32 -93.0 -92.9 ... -55.1 -55.0 * latitude (latitude) float32 20.0 20.1 20.2 ... 54.9 55.0 * time (time) float64 1.65e+18 1.65e+18 ... 1.65e+18 * forecast_reference_time (forecast_reference_time) float64 1.65e+18 Data variables: (12/21) hs (forecast_reference_time, time, latitude, longitude) float32 ... t02 (forecast_reference_time, time, latitude, longitude) float32 ... t0m1 (forecast_reference_time, time, latitude, longitude) float32 ... t01 (forecast_reference_time, time, latitude, longitude) float32 ... fp (forecast_reference_time, time, latitude, longitude) float32 ... dir (forecast_reference_time, time, latitude, longitude) float32 ... ... ... pdir2 (forecast_reference_time, time, latitude, longitude) float32 ... pws0 (forecast_reference_time, time, latitude, longitude) float32 ... pws1 (forecast_reference_time, time, latitude, longitude) float32 ... pws2 (forecast_reference_time, time, latitude, longitude) float32 ... tws (forecast_reference_time, time, latitude, longitude) float32 ... pnr (forecast_reference_time, time, latitude, longitude) float32 ... Attributes: (12/16) WAVEWATCH_III_version_number: 5.16 WAVEWATCH_III_switches: F90 SHRD NOGRB NOPA LRB4 NC4 TRKNC PR3 ... SIN4_namelist_parameter_BETAMAX: 1.649999976158142 product_name: ww3.20220411.nc area: NW Atlantic 6 arc min grd2 latitude_resolution: 0.1000000 ... ... easternmost_longitude: -55.00000 minimum_altitude: -12000 m maximum_altitude: 9000 m altitude_resolution: n/a start_date: 2022-04-11 12:00:00 stop_date: 2022-04-11 23:00:00
Holy crap, that worked.
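One caveat: the round-tripped time coordinate came back as raw float64 values around 1.65e+18, which look like nanoseconds since the Unix epoch rather than decoded datetimes. A quick hand-decode of one such value (the exact number here is illustrative, not copied from the response):

```python
import numpy as np

# Decode one round-tripped time value by hand, treating it as
# nanoseconds since the Unix epoch. 1.6496784e18 is an illustrative value.
t_ns = 1.6496784e18
print(np.datetime64(int(t_ns), "ns"))  # → 2022-04-11T12:00:00.000000000
```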