virocon.jointmodels module

Models for the joint probability distribution.

class virocon.jointmodels.GlobalHierarchicalModel(dist_descriptions)[source]

Bases: virocon.jointmodels.MultivariateModel

Hierarchical probabilistic model.

Probabilistic model that covers the complete range of an environmental variable (“global”), following a particular hierarchical dependence structure. The factorization describes a hierarchy where a random variable with index i can only depend upon random variables with indices less than i [1].
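Written out, the hierarchy described above corresponds to a factorization of the joint density of the following general form. The notation pa(i) for the set of conditioning indices is introduced here for illustration and is not taken from the library.

```latex
f(x_1, \dots, x_n) = \prod_{i=1}^{n} f\bigl(x_i \mid x_{\mathrm{pa}(i)}\bigr),
\qquad \mathrm{pa}(i) \subseteq \{1, \dots, i-1\}
```

For an unconditional variable pa(i) is empty and the factor reduces to the marginal density f(x_i).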

Parameters

dist_descriptions (list of dict) – Description of the distributions, one dict per dimension.

distributions

The distributions used in the GlobalHierarchicalModel.

Type

list

conditional_on

Indicates the dependencies between the variables of the model. One entry per distribution/dimension. Contains either None or int. If the ith entry is None, the ith distribution is unconditional. If the ith entry is an int j, the ith distribution depends on the jth dimension.

Type

list
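The rule for conditional_on can be checked mechanically. The following is a minimal sketch that assumes 0-based indices, as in the Hs-Tz example where the second distribution uses "conditional_on" : 0. The helper is_valid_hierarchy is hypothetical and not part of virocon.

```python
def is_valid_hierarchy(conditional_on):
    """Return True if every entry is None or an index smaller than its own position."""
    for i, j in enumerate(conditional_on):
        if j is None:
            continue  # the ith distribution is unconditional
        if not isinstance(j, int) or not 0 <= j < i:
            return False  # dependence on itself or on a later dimension
    return True


print(is_valid_hierarchy([None, 0]))  # Hs unconditional, Tz depends on Hs: True
print(is_valid_hierarchy([1, None]))  # first variable depends on a later one: False
```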

interval_slicers

One interval slicer per dimension. Each slicer is used to slice the corresponding dimension into intervals when this is necessary during fitting.

Type

list
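To illustrate what slicing a dimension into intervals can look like, here is a rough sketch of equal-width slicing with a minimum number of points per interval, in the spirit of the WidthOfIntervalSlicer(width=0.5, min_n_points=50) call in the example below. The function slice_by_width is hypothetical; where the intervals start and how the library treats sparse intervals are assumptions made here.

```python
def slice_by_width(values, width, min_n_points):
    """Group the indices of `values` into consecutive intervals of equal width.

    Intervals start at 0 (an assumption) and intervals holding fewer than
    `min_n_points` values are dropped.
    """
    bins = {}
    for idx, value in enumerate(values):
        k = int(value // width)  # number of the interval that contains value
        bins.setdefault(k, []).append(idx)
    return {
        (k * width, (k + 1) * width): indices
        for k, indices in sorted(bins.items())
        if len(indices) >= min_n_points
    }


hs = [0.1, 0.2, 0.4, 0.6, 1.3]
slices = slice_by_width(hs, width=0.5, min_n_points=2)
print(slices)  # only the interval from 0.0 to 0.5 holds at least two points
```

Each retained interval could then be used to fit the conditional distribution to the points of the dependent variable that fall into it.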

n_dim

The number of dimensions, i.e. the number of variables of the model.

Type

int

References

[1]

Haselsteiner, A.F.; Sander, A.; Ohlendorf, J.H.; Thoben, K.D. (2020) Global hierarchical models for wind and wave contours: physical interpretations of the dependence functions. OMAE 2020, Fort Lauderdale, USA. Proceedings of the 39th International Conference on Ocean, Offshore and Arctic Engineering.

Examples

Create an Hs-Tz model and fit it to the available data. The following example follows the methodology of OMAE 2020 [1].

Example 1.1:

Load the predefined OMAE 2020 model of Hs-Tz.

>>> from virocon import (GlobalHierarchicalModel, get_OMAE2020_Hs_Tz,
...                      read_ec_benchmark_dataset)
>>> data = read_ec_benchmark_dataset("datasets/ec-benchmark_dataset_D_1year.txt")
>>> dist_descriptions, fit_descriptions, semantics = get_OMAE2020_Hs_Tz()
>>> ghm = GlobalHierarchicalModel(dist_descriptions)
>>> ghm.fit(data, fit_descriptions=fit_descriptions)

Example 1.2:

Create the same OMAE 2020 model manually.

>>> import numpy as np
>>> from virocon import (DependenceFunction, ExponentiatedWeibullDistribution,
...                      LogNormalDistribution, WidthOfIntervalSlicer)
>>> def _asymdecrease3(x, a, b, c):
...     return a + b / (1 + c * x)
>>> def _lnsquare2(x, a, b, c):
...     return np.log(a + b * np.sqrt(np.divide(x, 9.81)))
>>> bounds = [(0, None),
...           (0, None),
...           (None, None)]
>>> sigma_dep = DependenceFunction(_asymdecrease3, bounds=bounds)
>>> mu_dep = DependenceFunction(_lnsquare2, bounds=bounds)
>>> dist_description_hs = {"distribution" : ExponentiatedWeibullDistribution(),
...                        "intervals" : WidthOfIntervalSlicer(width=0.5,
...                                                            min_n_points=50)
...                       }
>>> dist_description_tz = {"distribution" : LogNormalDistribution(),
...                        "conditional_on" : 0,
...                        "parameters" : {"sigma" : sigma_dep,
...                                        "mu": mu_dep,
...                                        },
...                       }
>>> dist_descriptions = [dist_description_hs, dist_description_tz]
>>> fit_description_hs = {"method" : "wlsq", "weights" : "quadratic"}
>>> fit_descriptions = [fit_description_hs, None]
>>> semantics = {"names" : ["Significant wave height", "Zero-crossing wave period"],
...              "symbols" : ["H_s", "T_z"],
...              "units" : ["m", "s"]
...              }
>>> ghm = GlobalHierarchicalModel(dist_descriptions)
>>> ghm.fit(data, fit_descriptions=fit_descriptions)
conditional_cdf(x, dim, given, *, random_state=None)[source]
conditional_icdf(p, dim, given, *, random_state=None)[source]
draw_sample(n, *, random_state=None)[source]

Draw a random sample of size n.

Parameters
  • n (int) – Sample size.

  • random_state ({None, int, numpy.random.Generator}, optional) – Can be used to draw a reproducible sample.
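The idea of a reproducible hierarchical sample can be sketched without the library. In this toy example the first variable is drawn from its marginal distribution and the second from a distribution whose location depends on the first. The function draw_hierarchical_sample and both toy distributions are assumptions made for illustration, and only None or an int is accepted as random_state here.

```python
import random


def draw_hierarchical_sample(n, random_state=None):
    """Draw n pairs (x0, x1) where x1 is drawn conditionally on x0."""
    rng = random.Random(random_state)  # an int seed makes the sample reproducible
    sample = []
    for _ in range(n):
        x0 = rng.expovariate(1.0)  # dimension 0: unconditional
        x1 = rng.gauss(x0, 1.0)    # dimension 1: mean depends on x0
        sample.append((x0, x1))
    return sample


first = draw_hierarchical_sample(5, random_state=42)
second = draw_hierarchical_sample(5, random_state=42)
print(first == second)  # same seed, same sample
```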

fit(data, fit_descriptions=None)[source]

Fit joint model to data.

Estimates the parameters of the model's probability distributions from the given data.

Parameters
  • data (array-like) – The data that should be used to fit the joint model. Shape: (number of realizations, n_dim)

  • fit_descriptions (list, optional) – Description of the fit method, one entry (dict or None) per dimension. Defaults to None.

marginal_cdf(x, dim)[source]

Marginal cumulative distribution function.

Parameters
  • x (array_like) – Points at which the cdf is evaluated. Shape: 1-dimensional

  • dim (int) – The dimension for which the marginal is calculated.

marginal_icdf(p, dim, precision_factor=1)[source]

Marginal inverse cumulative distribution function.

Estimates the marginal icdf by drawing a Monte-Carlo sample.

Parameters
  • p (array_like) – Probabilities for which the icdf is evaluated. Shape: 1-dimensional

  • dim (int) – The dimension for which the marginal is calculated.

  • precision_factor (float) – Precision factor that determines the size of the sample to draw. The sample is sized so that, on average, precision_factor * 100 realizations exceed the quantile. The minimum sample size is 100000. Defaults to 1.0.
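The sizing rule and the Monte Carlo estimate can be sketched as follows. This is a simplified reading of the description above, not the library's implementation: the helpers sample_size and monte_carlo_icdf are hypothetical, the sample comes from a toy exponential distribution, and only the largest requested probability is used to size the sample.

```python
import math
import random

MIN_SAMPLE_SIZE = 100_000  # minimum sample size stated in the description


def sample_size(p_max, precision_factor=1.0):
    """Size the sample so that about precision_factor * 100 points exceed the p_max quantile."""
    n_exceeding = precision_factor * 100
    n = math.ceil(n_exceeding / (1.0 - p_max))  # requires p_max < 1
    return max(n, MIN_SAMPLE_SIZE)


def monte_carlo_icdf(draw, p, precision_factor=1.0, random_state=None):
    """Estimate quantiles for the probabilities p from a sorted Monte Carlo sample."""
    rng = random.Random(random_state)
    n = sample_size(max(p), precision_factor)
    values = sorted(draw(rng) for _ in range(n))
    return [values[min(int(q * n), n - 1)] for q in p]


# Toy check with Exp(1), whose exact quantiles are -ln(1 - p).
median, q90 = monte_carlo_icdf(lambda rng: rng.expovariate(1.0), [0.5, 0.9], random_state=1)
print(median, math.log(2))
print(q90, math.log(10))
```

A larger precision_factor increases the sample size for extreme quantiles and therefore reduces the scatter of the estimate at the cost of run time.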

marginal_pdf(x, dim)[source]

Marginal probability density function.

Parameters
  • x (array_like) – Points at which the pdf is evaluated. Shape: 1-dimensional

  • dim (int) – The dimension for which the marginal is calculated.

pdf(x)[source]

Probability density function.

Parameters

x (array_like) – Points at which the pdf is evaluated. Shape: (n, n_dim), where n is the number of points at which the pdf should be evaluated.
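For a two-dimensional hierarchical model the joint density is the product of the marginal density of the first variable and the conditional density of the second. The sketch below evaluates such a product for points given as rows of shape (n, n_dim). The two toy densities and all function names are assumptions chosen for illustration; they are not the distributions used by virocon.

```python
import math


def marginal_pdf_x0(x0):
    """Toy marginal density of the first variable: Exp(1)."""
    return math.exp(-x0) if x0 >= 0 else 0.0


def conditional_pdf_x1(x1, x0):
    """Toy conditional density of the second variable: Normal(mean=x0, sd=1)."""
    return math.exp(-0.5 * (x1 - x0) ** 2) / math.sqrt(2 * math.pi)


def joint_pdf(points):
    """Evaluate f(x0, x1) = f(x0) * f(x1 | x0) for each row (x0, x1)."""
    return [marginal_pdf_x0(x0) * conditional_pdf_x1(x1, x0) for x0, x1 in points]


points = [(1.0, 1.0), (2.0, 0.5)]  # shape (n, n_dim) with n = 2, n_dim = 2
densities = joint_pdf(points)
print(densities)
```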

class virocon.jointmodels.TransformedModel(model: virocon.jointmodels.GlobalHierarchicalModel, transform: Callable, inverse: Callable, jacobian: Callable, precision_factor: float = 1.0, random_state: Optional[int] = None)[source]

Bases: virocon.jointmodels.MultivariateModel

__init__(model: virocon.jointmodels.GlobalHierarchicalModel, transform: Callable, inverse: Callable, jacobian: Callable, precision_factor: float = 1.0, random_state: Optional[int] = None)[source]

A joint distribution that was defined in another variable space.

Parameters
  • model (GlobalHierarchicalModel) – Joint distribution in original variable space

  • transform (Callable) – Function to transform this model back to original variable space

  • inverse (Callable) – Function to transform from the original variable space to this model’s space

  • jacobian (Callable) – Jacobian matrix of the transformation; see page 31 in DOI: 10.26092/elib/2181

  • precision_factor (float, optional) – Lower precision results in faster computation. Defaults to 1.0.

  • random_state (int, optional) – Can be used to fix random numbers. Defaults to None.
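The roles of transform, inverse and jacobian can be illustrated with a one-dimensional change of variables. In this sketch transform maps from the new variable space back to the original one, as described above, and the density in the new space is the original density times the absolute derivative of transform. All functions are toy assumptions; the exact form that virocon expects for the jacobian callable is not stated here.

```python
import math


def pdf_original(x):
    """Toy density in the original variable space: Exp(1)."""
    return math.exp(-x) if x >= 0 else 0.0


def transform(y):
    """Map from the new variable space back to the original space."""
    return y ** 2


def inverse(x):
    """Map from the original variable space to the new space."""
    return math.sqrt(x)


def jacobian(y):
    """Derivative of transform with respect to y (a 1x1 Jacobian)."""
    return 2.0 * y


def pdf_transformed(y):
    """Density in the new space by change of variables."""
    if y < 0:
        return 0.0
    return pdf_original(transform(y)) * abs(jacobian(y))


# Sanity check: the transformed density should integrate to about 1.
steps, upper = 10_000, 10.0
h = upper / steps
total = sum(pdf_transformed((k + 0.5) * h) for k in range(steps)) * h  # midpoint rule
print(total)
```

Omitting the Jacobian factor would give a function that no longer integrates to one, which is why the callable is required in addition to the two mappings.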

cdf(x)[source]

Cumulative distribution function.

Parameters

x (array_like) – Points at which the cdf is evaluated. Shape: (n, n_dim), where n is the number of points at which the cdf should be evaluated.
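When no closed form is available, a multivariate cdf can be approximated from a sample by counting the points that do not exceed the evaluation point in any dimension. The sketch below shows this technique; whether TransformedModel.cdf works this way is an assumption, and empirical_cdf_sketch is a hypothetical helper rather than the library's method.

```python
def empirical_cdf_sketch(x, sample):
    """For each point in x, return the fraction of sample rows that are <= it in every dimension."""
    result = []
    for point in x:
        count = sum(
            all(s <= p for s, p in zip(row, point))
            for row in sample
        )
        result.append(count / len(sample))
    return result


sample = [(1.0, 2.0), (2.0, 1.0), (3.0, 3.0), (0.5, 0.5)]
print(empirical_cdf_sketch([(2.0, 2.0), (3.0, 3.0)], sample))  # [0.75, 1.0]
```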

draw_sample(n)[source]

Draw a random sample of size n.

Parameters

n (int) – Sample size.

empirical_cdf(x, sample=None)[source]
fit(data, *args, **kwargs)[source]

Fit joint model to data.

Estimates the parameters of the model's probability distributions from the given data.

Parameters

data (array-like) – The data that should be used to fit the joint model. Shape: (number of realizations, n_dim)

marginal_cdf(x, dim)[source]

Marginal cumulative distribution function.

Parameters
  • x (array_like) – Points at which the cdf is evaluated. Shape: 1-dimensional

  • dim (int) – The dimension for which the marginal is calculated.

marginal_pdf(x, dim)[source]

Marginal probability density function.

Parameters
  • x (array_like) – Points at which the pdf is evaluated. Shape: 1-dimensional

  • dim (int) – The dimension for which the marginal is calculated.

pdf(x)[source]

Probability density function.

Parameters

x (array_like) – Points at which the pdf is evaluated. Shape: (n, n_dim), where n is the number of points at which the pdf should be evaluated.

property sample