API

Subset operation

daops.ops.subset.subset(collection, time=None, area=None, level=None, output_dir=None, output_type='netcdf', split_method='time:auto', file_namer='standard')[source]

Subset the input dataset according to the given parameters. Data can be subsetted by time, area and level.

Parameters
  • collection – Collection of datasets to process: a sequence of dataset identifiers, or a string of comma-separated identifiers.

  • time – Time range to subset over: a sequence of two time values, or a string of two “/”-separated time values.

  • area – Area to subset over: a sequence, or a string, of comma-separated lat and lon bounds. Must contain 4 values.

  • level – Level range to subset over: a sequence of two level values, or a string of two “/”-separated level values.

  • output_dir – str or path-like object describing the output directory for output files.

  • output_type – {“netcdf”, “nc”, “zarr”, “xarray”}

  • split_method – {“time:auto”}

  • file_namer – {“standard”, “simple”}

Returns

List of outputs of the selected type: a list of xarray Datasets or a list of file paths.

Examples

collection: (“cmip6.ukesm1.r1.gn.tasmax.v20200101”,)
time: (“1999-01-01T00:00:00”, “2100-12-30T00:00:00”)
area: (-5.,49.,10.,65)
level: (1000.,)
output_type: “netcdf”
output_dir: “/cache/wps/procs/req0111”
split_method: “time:auto”
file_namer: “simple”
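The sequence-or-string parameter forms accepted by subset can be illustrated with a small normalisation sketch. The helper names below are hypothetical (they are not part of daops); they only show how the two accepted input shapes map to the same values.

```python
# Hypothetical helpers (not part of daops) illustrating the two accepted
# input forms: a sequence of values, or a single delimited string.

def parse_area(area):
    """Normalise an area (sequence or comma-separated string) into a
    4-tuple of floats giving the lat/lon bounds."""
    if isinstance(area, str):
        area = area.split(",")
    values = tuple(float(v) for v in area)
    if len(values) != 4:
        raise ValueError("area must contain 4 values")
    return values

def parse_time(time):
    """Normalise a time range (sequence of two values, or a string of
    two '/'-separated values) into a (start, end) tuple."""
    if isinstance(time, str):
        time = time.split("/")
    start, end = time
    return (start, end)
```

For example, parse_area("-5.,49.,10.,65") and parse_area((-5., 49., 10., 65.)) both yield the same bounds.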

Utilities

daops.utils.consolidate.consolidate(collection, **kwargs)[source]

Finds the file paths relating to each input dataset. If a time range has been supplied then only the files relating to this time range are recorded.

Parameters
  • collection – (roocs_utils.CollectionParameter) The collection of datasets to process.

  • kwargs – Arguments of the operation taking place e.g. subset, average, or re-grid.

Returns

An ordered dictionary of each dataset from the collection argument and the file paths relating to it.
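The shape of the returned mapping, and the time filtering, can be sketched as follows. This is not the daops implementation: the file time spans are assumed to be known up front (in practice they are derived from the file names or metadata), and all names here are illustrative.

```python
from collections import OrderedDict

# Sketch of consolidate's output: an ordered dictionary mapping each
# dataset id to the file paths whose time span overlaps the requested
# time range. `collection` maps ds id -> [(path, (start, end)), ...].

def consolidate_sketch(collection, time=None):
    results = OrderedDict()
    for ds_id, files in collection.items():
        kept = []
        for path, (f_start, f_end) in files:
            # Keep a file if no time range was given, or its span
            # overlaps the requested range.
            if time is None or (f_end >= time[0] and f_start <= time[1]):
                kept.append(path)
        results[ds_id] = kept
    return results
```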

daops.utils.consolidate.convert_to_ds_id(dset)[source]

Converts the input dataset to DRS id form for use with the Elasticsearch index.

Parameters

dset – Dataset to process. Formats currently accepted are file paths and paths to directories.

Returns

The ds id for the input dataset.
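The core of the path-to-id conversion can be sketched as below. The real function consults project configuration to locate the base directory; the helper name and base path here are assumptions for illustration only.

```python
import os

# Hypothetical sketch: a directory layout such as
#   /data/cmip5/output1/INM/inmcm4/rcp45/mon/ocean/Omon/r1i1p1/latest/zostoga
# maps to the dot-separated DRS id
#   cmip5.output1.INM.inmcm4.rcp45.mon.ocean.Omon.r1i1p1.latest.zostoga

def path_to_ds_id(path, base="/data"):
    rel = os.path.relpath(path, base)
    return rel.replace(os.sep, ".")
```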

daops.utils.core.is_characterised(collection, require_all=False)[source]

Takes a collection (an individual data reference or a sequence of them) and returns an ordered dictionary mapping each dataset id to a boolean stating whether that dataset has been characterised.

If require_all is True, a single boolean value is returned instead.

Parameters
  • collection – one or more data references

  • require_all – Boolean to require that all must be characterised

Returns

Ordered Dictionary OR Boolean (if require_all is True)
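The two return shapes can be sketched as follows. The real function queries the characterisation store; here a stand-in set of “known good” ids replaces that lookup, and all names are illustrative.

```python
from collections import OrderedDict

# Stand-in for the real characterisation store.
CHARACTERISED = {"cmip5.a", "cmip5.b"}

def is_characterised_sketch(collection, require_all=False):
    if isinstance(collection, str):
        collection = [collection]
    # One boolean per dataset id, in input order.
    results = OrderedDict((ds_id, ds_id in CHARACTERISED) for ds_id in collection)
    if require_all:
        # Collapse to a single boolean: are ALL datasets characterised?
        return all(results.values())
    return results
```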

daops.utils.core.is_dataref_characterised(dset)[source]
daops.utils.core.open_dataset(ds_id, file_paths)[source]

Opens an xarray Dataset and applies fixes if required. Fixes are applied to the data either before or after the dataset is opened. Whether a fix is a ‘pre-processor’ or ‘post-processor’ is defined in the fix itself.

Parameters
  • ds_id – Dataset identifier in the form of a drs id e.g. cmip5.output1.INM.inmcm4.rcp45.mon.ocean.Omon.r1i1p1.latest.zostoga

  • file_paths – (list) The file paths corresponding to the ds id.

Returns

xarray Dataset with fixes applied to the data.
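The pre/post fix flow described above can be sketched as follows. This is not the daops implementation: a plain dict stands in for the xarray Dataset, and the function and fix names are assumptions for illustration.

```python
# Sketch of the fix flow: "pre" fixes transform the inputs before the
# dataset is opened; "post" fixes transform the opened dataset.
# `fixes` is a list of (callable, stage) pairs.

def open_dataset_sketch(file_paths, fixes):
    pre = [f for f, stage in fixes if stage == "pre"]
    post = [f for f, stage in fixes if stage == "post"]
    for fix in pre:
        file_paths = fix(file_paths)
    ds = {"files": list(file_paths)}  # stand-in for xr.open_mfdataset(...)
    for fix in post:
        ds = fix(ds)
    return ds
```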

class daops.utils.fixer.Fixer(ds_id)[source]

Bases: object

Fixer class to look up fixes to apply to the input dataset from the Elasticsearch index. Gathers fixes into pre- and post-processors. Pre-process fixes are chained together so that they can be executed with one call.

class daops.utils.fixer.FuncChainer(funcs)[source]

Bases: object

Chains functions together to allow them to be executed in one call.
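The chaining idea can be shown with a minimal sketch (the class name and internals here are illustrative, not the daops implementation):

```python
# Minimal sketch of function chaining: compose a list of functions so
# they run left-to-right in a single call.

class FuncChainerSketch:
    def __init__(self, funcs):
        self.funcs = funcs

    def __call__(self, value):
        # Feed each function's output into the next.
        for func in self.funcs:
            value = func(value)
        return value
```

For example, FuncChainerSketch([f, g])(x) computes g(f(x)) in one call.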

class daops.utils.normalise.ResultSet(inputs=None)[source]

Bases: object

A class to hold the results from an operation, e.g. subset.

add(dset, result)[source]

Adds outputs to an ordered dictionary with the ds id as the key. If an output is a file path, it is also appended to the file_paths attribute so that the list of file paths can be accessed independently.
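The behaviour of add() can be sketched as follows. This is not the daops implementation; in particular, treating any string result as a file path is a simplifying assumption for illustration.

```python
from collections import OrderedDict

# Minimal sketch of ResultSet: results keyed by ds id, with file-path
# outputs additionally collected in a flat file_paths list.

class ResultSetSketch:
    def __init__(self, inputs=None):
        self.inputs = inputs
        self.results = OrderedDict()
        self.file_paths = []

    def add(self, dset, result):
        self.results[dset] = result
        if isinstance(result, str):  # treat string results as file paths
            self.file_paths.append(result)
```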

daops.utils.normalise.normalise(collection)[source]

Opens the datasets given by the file paths and applies any required fixes.

Parameters

collection – Ordered dictionary of ds ids and their related file paths.

Returns

An ordered dictionary of ds ids and their fixed xarray Dataset.
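The shape of normalise() can be sketched as a mapping step over the consolidated collection. The opener argument here stands in for daops.utils.core.open_dataset; the sketch itself is illustrative, not the daops code.

```python
from collections import OrderedDict

# Sketch: for each ds id in the consolidated collection, open its files
# (applying fixes) and keep the ordered-dict shape of the input.

def normalise_sketch(collection, opener):
    return OrderedDict(
        (ds_id, opener(ds_id, file_paths))
        for ds_id, file_paths in collection.items()
    )
```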

Data Utilities

daops.data_utils.coord_utils.add_scalar_coord(ds, **operands)[source]
Parameters
  • ds – Xarray Dataset

  • operands – (dict) Arguments for fix. Id, value and data type of scalar coordinate to add.

Returns

Xarray Dataset

daops.data_utils.coord_utils.squeeze_dims(ds, **operands)[source]
Parameters
  • ds – Xarray Dataset

  • operands – (dict) Arguments for fix. Dims (list) to remove.

Returns

Xarray Dataset
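The calling convention shared by these data_utils fixes, a dataset plus keyword operands describing the change, can be sketched as below. A dict of dim name to size stands in for the xarray Dataset, and the function name is illustrative.

```python
# Sketch of the data_utils fix signature: dataset in, **operands
# describing the change. This mirrors squeeze_dims by dropping the
# listed dims from a stand-in dataset mapping dim -> size.

def squeeze_dims_sketch(ds, **operands):
    dims_to_drop = operands.get("dims", [])
    return {d: s for d, s in ds.items() if d not in dims_to_drop}
```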

Processor

daops.processor.dispatch(operation, dset, **kwargs)[source]
daops.processor.process(operation, dset, mode='serial', **kwargs)[source]

Runs the processing operation on the dataset in the requested mode (serial or parallel).
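The mode switch can be sketched as follows; the function names and the serial-only behaviour are assumptions for illustration, not the daops internals (which also support a parallel mode).

```python
# Sketch of the dispatch/process pair: "serial" applies the operation
# directly; a parallel scheduler would plug in for other modes.

def dispatch_sketch(operation, dset, **kwargs):
    return operation(dset, **kwargs)

def process_sketch(operation, dset, mode="serial", **kwargs):
    if mode == "serial":
        return dispatch_sketch(operation, dset, **kwargs)
    raise NotImplementedError(f"mode {mode!r} not supported in this sketch")
```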