daops.ops package

Operations module for the DAOPS package.

Submodules

daops.ops.average module

Operations for averaging data over dimensions, shape or time.

daops.ops.average.average_over_dims(collection, dims=None, ignore_undetected_dims=False, output_dir=None, output_type='netcdf', split_method='time:auto', file_namer='standard', apply_fixes=True)

Average input dataset according to indicated dimensions.

Can be averaged over multiple dimensions.

Parameters:
  • collection (Collection of datasets to process, sequence or string of comma-separated dataset identifiers.)

  • dims (List of dims to average over, or None.)

  • ignore_undetected_dims (Boolean. If False, an exception is raised if the requested dims do not exist in the dataset. If True, missing dims are ignored.)

  • output_dir (str or path-like object describing the output directory for output files.)

  • output_type ({“netcdf”, “nc”, “zarr”, “xarray”})

  • split_method ({“time:auto”})

  • file_namer ({“standard”, “simple”})

  • apply_fixes (Boolean. If True fixes will be applied to datasets if needed. Default is True.)

Returns:

List of outputs in the selected type (a list of xarray Datasets or file paths.)

Examples

collection: (“cmip6.ukesm1.r1.gn.tasmax.v20200101”,)
dims: [“time”, “lat”]
ignore_undetected_dims: False
output_type: “netcdf”
output_dir: “/cache/wps/procs/req0111”
split_method: “time:auto”
file_namer: “standard”
apply_fixes: True
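
The effect of ignore_undetected_dims can be illustrated with a small standalone sketch. The helper select_dims below is hypothetical and is not part of daops; the use of ValueError is also an assumption, since the exception type raised by the library is not stated here.

```python
def select_dims(requested, available, ignore_undetected_dims=False):
    """Return the requested dims that are present in ``available``.

    Hypothetical helper mirroring the documented behaviour of
    ``ignore_undetected_dims``; it is not the daops implementation.
    """
    missing = [dim for dim in requested if dim not in available]
    if missing and not ignore_undetected_dims:
        # Assumption: ValueError stands in for whatever daops raises.
        raise ValueError(f"Requested dims not found in dataset: {missing}")
    # Keep the caller's ordering and drop anything that was not found.
    return [dim for dim in requested if dim in available]


dataset_dims = ["time", "lat", "lon"]

# Every requested dim exists, so the request passes through unchanged.
strict = select_dims(["time", "lat"], dataset_dims)

# "level" is absent; with the flag set it is dropped instead of raising.
lenient = select_dims(["time", "level"], dataset_dims, ignore_undetected_dims=True)
```

With the flag left at its default of False, requesting "level" against the same dataset_dims would raise instead of being dropped.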

daops.ops.average.average_shape(collection, shape, variable=None, output_dir=None, output_type='netcdf', split_method='time:auto', file_namer='standard', apply_fixes=True)

Average input dataset over indicated shape.

Parameters:
  • collection (Collection of datasets to process, sequence or string of comma-separated dataset identifiers.)

  • shape (Path to a shape file, or a geodataframe, defining the region to average within.)

  • variable (Variables to average. If None, average over all data variables.)

  • output_dir (str or path-like object describing the output directory for output files.)

  • output_type ({“netcdf”, “nc”, “zarr”, “xarray”})

  • split_method ({“time:auto”})

  • file_namer ({“standard”, “simple”})

  • apply_fixes (Boolean. If True fixes will be applied to datasets if needed. Default is True.)

Returns:

List of outputs in the selected type (a list of xarray Datasets or file paths.)

Examples

collection: (“cmip6.cmip.cas.fgoals-g3.historical.r1i1p1f1.Amon.tas.gn.v20190818”,)
shape: “path_to_shape”
output_type: “netcdf”
output_dir: “/cache/wps/procs/req0111”
split_method: “time:auto”
file_namer: “standard”
apply_fixes: True
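
As a minimal sketch, the example values above could be passed to average_shape as keyword arguments. The wrapper run_average_shape is a hypothetical convenience, the shape path is a placeholder, and the dataset identifier is illustrative only; a real call needs daops installed and access to the referenced data.

```python
# Keyword arguments mirroring the example values listed above.
shape_kwargs = {
    "collection": (
        "cmip6.cmip.cas.fgoals-g3.historical.r1i1p1f1.Amon.tas.gn.v20190818",
    ),
    "shape": "path_to_shape",  # placeholder: path to a shape file
    "variable": None,  # None means average over all data variables
    "output_type": "netcdf",
    "output_dir": "/cache/wps/procs/req0111",
    "split_method": "time:auto",
    "file_namer": "standard",
    "apply_fixes": True,
}


def run_average_shape(**kwargs):
    """Call average_shape with the given keyword arguments (hypothetical wrapper)."""
    # Imported lazily so the snippet can be loaded without daops installed.
    from daops.ops.average import average_shape

    return average_shape(**kwargs)
```

Calling run_average_shape(**shape_kwargs) would then run the operation, provided the shape file and dataset exist in the local environment.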

daops.ops.average.average_time(collection, freq='year', output_dir=None, output_type='netcdf', split_method='time:auto', file_namer='standard', apply_fixes=True)

Average input dataset according to indicated frequency.

Parameters:
  • collection (Collection of datasets to process, sequence or string of comma-separated dataset identifiers.)

  • freq (Frequency to average over {“day”, “month”, “year”})

  • output_dir (str or path-like object describing the output directory for output files.)

  • output_type ({“netcdf”, “nc”, “zarr”, “xarray”})

  • split_method ({“time:auto”})

  • file_namer ({“standard”, “simple”})

  • apply_fixes (Boolean. If True fixes will be applied to datasets if needed. Default is True.)

Returns:

List of outputs in the selected type (a list of xarray Datasets or file paths.)

Examples

collection: (“cmip6.ukesm1.r1.gn.tasmax.v20200101”,)
freq: “month”
output_type: “netcdf”
output_dir: “/cache/wps/procs/req0111”
split_method: “time:auto”
file_namer: “standard”
apply_fixes: True
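
A call to average_time might be wrapped as in the sketch below, which checks freq against the values listed above before handing off to the library. The wrapper average_by_frequency and its up-front check are assumptions made for illustration; daops may validate freq differently.

```python
# Frequencies listed for the ``freq`` parameter above.
ALLOWED_FREQS = {"day", "month", "year"}


def average_by_frequency(collection, freq="year", **kwargs):
    """Validate ``freq`` and then call average_time (hypothetical wrapper)."""
    if freq not in ALLOWED_FREQS:
        # Fail early with a clear message rather than deep inside processing.
        raise ValueError(
            f"freq must be one of {sorted(ALLOWED_FREQS)}, got {freq!r}"
        )
    # Imported lazily so the snippet can be loaded without daops installed.
    from daops.ops.average import average_time

    return average_time(collection, freq=freq, **kwargs)
```

For instance, average_by_frequency(("cmip6.ukesm1.r1.gn.tasmax.v20200101",), freq="month", output_dir="/cache/wps/procs/req0111") corresponds to the example values above, assuming the dataset can be resolved locally.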

daops.ops.base module

Base class for all Operations.

class daops.ops.base.Operation(collection, file_namer='standard', split_method='time:auto', output_dir=None, output_type='netcdf', apply_fixes=True, **params)

Bases: object

Base class for all Operations.

_consolidate_collection()

Take the collection object and find the file paths relating to each input dataset.

If a time range has been supplied, only the files relating to that time range are recorded. The result is set to self.collection.

_resolve_params(collection, **params)

Resolve the operation-specific input parameters and store them in self.params, then parameterise the collection parameter and set it to self.collection.

calculate()

Process the input and calculate the result using clisops.

Return the result as a daops.normalise.ResultSet object.

get_operation_callable()

Return the operation callable from clisops.
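
The way these methods fit together can be sketched with a toy class. Everything below is a hypothetical stand-in rather than the daops implementation: SketchOperation, the file_lookup mapping, and the plain dict returned by calculate (in place of a ResultSet) are assumptions, as is the order in which the constructor calls the two private methods.

```python
class SketchOperation:
    """Toy stand-in showing the steps described for Operation."""

    def __init__(self, collection, file_lookup, **params):
        # Hypothetical mapping of dataset id -> list of file paths.
        self._file_lookup = file_lookup
        self._resolve_params(collection, **params)
        self._consolidate_collection()

    def _resolve_params(self, collection, **params):
        # Accept a sequence or a comma-separated string of identifiers.
        if isinstance(collection, str):
            collection = [item.strip() for item in collection.split(",")]
        self.collection = list(collection)
        self.params = dict(params)

    def _consolidate_collection(self):
        # Replace the list of ids with a mapping of id -> file paths.
        self.collection = {
            ds_id: self._file_lookup[ds_id] for ds_id in self.collection
        }

    def get_operation_callable(self):
        # Subclasses supply the function that does the actual work.
        raise NotImplementedError

    def calculate(self):
        operation = self.get_operation_callable()
        # Apply the callable to each dataset's files with the stored params.
        return {
            ds_id: operation(paths, **self.params)
            for ds_id, paths in self.collection.items()
        }


class CountFiles(SketchOperation):
    """Trivial subclass whose operation counts the files per dataset."""

    def get_operation_callable(self):
        return lambda paths, **params: len(paths)


lookup = {"ds.a": ["a_1.nc", "a_2.nc"], "ds.b": ["b_1.nc"]}
result = CountFiles("ds.a, ds.b", lookup).calculate()
```

The subclass only overrides get_operation_callable, which is the same division of labour the method descriptions above suggest for concrete operations.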

daops.ops.regrid module

Regrid operation.

daops.ops.regrid.regrid(collection, method='nn', adaptive_masking_threshold=0.5, grid='1deg', output_dir=None, output_type='netcdf', split_method='time:auto', file_namer='standard', apply_fixes=True)

Regrid input dataset according to specified method and output grid.

The adaptive masking threshold can also be specified.

Parameters:
  • collection (Collection of datasets to process, sequence or string of comma-separated dataset identifiers.)

  • method (The method by which to regrid.)

  • adaptive_masking_threshold

  • grid (The desired output grid.)

  • output_dir (str or path-like object describing the output directory for output files.)

  • output_type ({“netcdf”, “nc”, “zarr”, “xarray”})

  • split_method ({“time:auto”})

  • file_namer ({“standard”, “simple”})

  • apply_fixes (Boolean. If True fixes will be applied to datasets if needed. Default is True.)

Returns:

List of outputs in the selected type (a list of xarray Datasets or file paths.)

Examples

collection: (“cmip6.ukesm1.r1.gn.tasmax.v20200101”,)
method: “nn”
adaptive_masking_threshold: 0.5
grid: “1deg”
output_type: “netcdf”
output_dir: “/cache/wps/procs/req0111”
split_method: “time:auto”
file_namer: “standard”
apply_fixes: True
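
Since most regrid arguments have defaults, a caller typically only needs to override a few of them. The sketch below copies the defaults from the signature above and merges overrides into them; build_regrid_kwargs and run_regrid are hypothetical helpers, not part of daops.

```python
# Defaults copied from the regrid signature documented above.
REGRID_DEFAULTS = {
    "method": "nn",
    "adaptive_masking_threshold": 0.5,
    "grid": "1deg",
    "output_dir": None,
    "output_type": "netcdf",
    "split_method": "time:auto",
    "file_namer": "standard",
    "apply_fixes": True,
}


def build_regrid_kwargs(collection, **overrides):
    """Merge caller overrides into the documented defaults."""
    unknown = sorted(set(overrides) - set(REGRID_DEFAULTS))
    if unknown:
        # Catch misspelt argument names before calling the library.
        raise TypeError(f"Unexpected regrid arguments: {unknown}")
    return {"collection": collection, **REGRID_DEFAULTS, **overrides}


def run_regrid(**kwargs):
    """Call regrid with the given keyword arguments."""
    # Imported lazily so the snippet can be loaded without daops installed.
    from daops.ops.regrid import regrid

    return regrid(**kwargs)


regrid_kwargs = build_regrid_kwargs(
    ("cmip6.ukesm1.r1.gn.tasmax.v20200101",),
    output_dir="/cache/wps/procs/req0111",
)
```

Passing the result to run_regrid(**regrid_kwargs) reproduces the example values above, assuming the illustrative dataset identifier resolves in the local environment.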

daops.ops.subset module

Subset operation.

daops.ops.subset.subset(collection, time=None, area=None, level=None, time_components=None, output_dir=None, output_type='netcdf', split_method='time:auto', file_namer='standard', apply_fixes=True)

Subset input dataset according to parameters.

Can be subset by level, area, and time.

Parameters:
  • collection (Collection of datasets to process, sequence or string of comma-separated dataset identifiers.)

  • time (Time interval (defined by start/end) or time series (a sequence of datetime values) to subset over. Datetimes are typically provided as strings.)

  • area (Area to subset over, sequence or string of comma-separated lat and lon bounds. Must contain 4 values.)

  • level (Level interval (defined by start/end) or level series (a sequence of values) to subset over. Levels are typically provided as integers or floats.)

  • time_components (Time components to filter on: year, month, day, hour, minute, second.)

  • output_dir (str or path-like object describing the output directory for output files.)

  • output_type ({“netcdf”, “nc”, “zarr”, “xarray”})

  • split_method ({“time:auto”})

  • file_namer ({“standard”, “simple”})

  • apply_fixes (Boolean. If True fixes will be applied to datasets if needed. Default is True.)

Returns:

List of outputs in the selected type (a list of xarray Datasets or file paths.)

Examples

collection: (“cmip6.ukesm1.r1.gn.tasmax.v20200101”,)
time: (“1999-01-01T00:00:00”, “2100-12-30T00:00:00”)
area: (-5.,49.,10.,65)
level: (1000.,)
time_components: {“month”: [“dec”, “jan”, “feb”]}
output_type: “netcdf”
output_dir: “/cache/wps/procs/req0111”
split_method: “time:auto”
file_namer: “standard”
apply_fixes: True
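
The area parameter accepts either a sequence or a comma-separated string and must contain 4 values. The sketch below normalises both forms before building the keyword arguments from the example above. parse_area and run_subset are hypothetical helpers; daops performs its own parameter handling, which may differ.

```python
def parse_area(area):
    """Normalise an area given as a sequence or comma-separated string.

    Hypothetical helper: returns a tuple of 4 floats or raises ValueError.
    """
    if isinstance(area, str):
        area = area.split(",")
    bounds = tuple(float(value) for value in area)
    if len(bounds) != 4:
        raise ValueError(f"area must contain 4 values, got {len(bounds)}")
    return bounds


# Keyword arguments mirroring the example values listed above.
subset_kwargs = {
    "collection": ("cmip6.ukesm1.r1.gn.tasmax.v20200101",),
    "time": ("1999-01-01T00:00:00", "2100-12-30T00:00:00"),
    "area": parse_area("-5.,49.,10.,65"),
    "level": (1000.0,),
    "time_components": {"month": ["dec", "jan", "feb"]},
    "output_type": "netcdf",
    "output_dir": "/cache/wps/procs/req0111",
    "split_method": "time:auto",
    "file_namer": "standard",
    "apply_fixes": True,
}


def run_subset(**kwargs):
    """Call subset with the given keyword arguments."""
    # Imported lazily so the snippet can be loaded without daops installed.
    from daops.ops.subset import subset

    return subset(**kwargs)
```

Calling run_subset(**subset_kwargs) would then perform the subset, assuming daops is installed and the illustrative dataset identifier can be resolved.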