UnaryCovarianceTW

class hypercoil.nn.cov.UnaryCovarianceTW(estimator: Callable, dim: int, min_lag: int = 0, max_lag: int = 0, out_channels: int = 1, rowvar: bool = True, biased: bool = False, ddof: int | None = None, l2: float = 0, *, key: 'jax.random.PRNGKey' | None = None)

Covariance measures of a single tensor, with a single learnable weight for each time lag.

By default, the weight is initialised following a double exponential function of lag, such that the weights at 0 lag are \(e^{-|0|} = 1\), the weights at 1 or -1 lag are \(e^{-|1|}\), etc. Note that if the maximum lag is 0, this default initialisation will be equivalent to an unweighted covariance.

The input tensor is interpreted as a set of multivariate observations. A covariance estimator computes some measure of statistical dependence among the variables in each observation, with the potential addition of stochastic noise and dropout to re-weight observations and regularise the model.
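As a rough illustration of this default, the following NumPy sketch builds a weight matrix whose entry at lag k is \(e^{-|k|}\). It is not the library's implementation: the helper name `double_exponential_weights` is hypothetical, and it assumes a symmetric band in which all lags beyond `max_lag` are zeroed.

```python
import numpy as np

def double_exponential_weights(n_obs, max_lag):
    """Hypothetical helper: (n_obs, n_obs) matrix with e^{-|lag|} on each
    diagonal, zeroed where |lag| exceeds max_lag (an assumed symmetric band)."""
    idx = np.arange(n_obs)
    lag = idx[:, None] - idx[None, :]            # lag[i, j] = i - j
    weight = np.exp(-np.abs(lag).astype(float))  # double exponential in lag
    return np.where(np.abs(lag) <= max_lag, weight, 0.0)

w0 = double_exponential_weights(5, max_lag=0)    # main diagonal only
w2 = double_exponential_weights(5, max_lag=2)    # lags -2 .. 2
```

With `max_lag=0` only the main diagonal survives and every weight on it is \(e^0 = 1\), so `w0` is the identity matrix. This is why the default initialisation reduces to an unweighted covariance.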

Dimension:
Input : \((N, *, C, O)\)

N denotes the batch size, * denotes any number of intervening dimensions, C denotes the number of data channels or variables, and O denotes the number of time points or observations per channel.

Output : \((N, *, W, C, C)\)

W denotes number of sets of weights.
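To make these shapes concrete, here is a minimal NumPy sketch of a weighted covariance with W weight sets. It is an illustration under stated assumptions, not the module's code: `weighted_cov` is a hypothetical helper, intervening dimensions are omitted, and the normalisation is the plain unweighted factor O - 1, which is only appropriate here because the example uses identity weights.

```python
import numpy as np

def weighted_cov(x, weight):
    """x: (N, C, O) data; weight: (W, O, O) weight sets -> (N, W, C, C)."""
    n_obs = x.shape[-1]
    xc = x - x.mean(axis=-1, keepdims=True)      # centre each channel over time
    # For each batch item n and weight set w: xc[n] @ weight[w] @ xc[n].T
    sigma = np.einsum('nco,wop,ndp->nwcd', xc, weight, xc)
    return sigma / (n_obs - 1)                   # unweighted normalisation (assumption)

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3, 10))                  # N=4, C=3, O=10
weight = np.stack([np.eye(10), np.eye(10)])      # W=2 identity weight sets
out = weighted_cov(x, weight)                    # shape (4, 2, 3, 3)
```

Because both weight sets are identity matrices, each output channel of this sketch matches the ordinary sample covariance of the corresponding batch item.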

Parameters:
estimator : callable

Covariance estimator, e.g. from hypercoil.functional.cov. The estimator must be unary: it should accept a single tensor rather than multiple tensors. Some available options are:

dim : int

Number of observations O per data instance. This determines the dimension of each slice of the covariance weight tensor.

min_lag, max_lag : int or None (default 0)

Minimum and maximum lags to include in the weight matrix. If these parameters are not None, the weight matrix is constrained to have nonzero entries only along diagonals whose offset from the main diagonal falls within the range set by (min_lag, max_lag). The default value of 0 permits weights only along the main diagonal.

out_channels : int (default 1)

Number of weight sets W to include. For each weight set, the module produces an output channel.

rowvar : bool (default True)

Indicates that the last axis of the input tensor is the observation axis and the penultimate axis is the variable axis. If False, then this relationship is transposed.

biased : bool (default False)

Indicates that the biased normalisation (i.e., division by the number of observations O in the unweighted case) should be performed. By default, normalisation of the covariance is unbiased (i.e., division by O - 1).

ddof : int or None (default None)

Degrees of freedom for normalisation. If this is specified, it overrides the normalisation factor automatically determined using the biased parameter.

l2 : nonnegative float (default 0)

L2 regularisation term to add to the maximum likelihood estimate of the covariance matrix. This can be set to a positive value to obtain an intermediate for estimating the regularised inverse covariance or to ensure that the covariance matrix is non-singular (if, for instance, it needs to be inverted or projected into a tangent space).
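The interaction of `biased`, `ddof` and `l2` can be sketched in NumPy as follows. This is an illustration under assumptions rather than the library's implementation: `simple_cov` is a hypothetical helper, it covers only the unweighted case with `rowvar=True`, and it assumes the L2 term is added as `l2` times the identity matrix.

```python
import numpy as np

def simple_cov(x, biased=False, ddof=None, l2=0.0):
    """Unweighted covariance of x with shape (C, O), rows as variables."""
    n_vars, n_obs = x.shape
    if ddof is None:
        ddof = 0 if biased else 1                # biased: / O, unbiased: / (O - 1)
    xc = x - x.mean(axis=-1, keepdims=True)      # centre each variable
    sigma = xc @ xc.T / (n_obs - ddof)
    return sigma + l2 * np.eye(n_vars)           # assumed form of the L2 term

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 2))                      # C=3 variables, only O=2 observations

sigma = simple_cov(x)                            # singular: fewer observations than variables
sigma_reg = simple_cov(x, l2=0.1)                # eigenvalues shifted up by 0.1
precision = np.linalg.inv(sigma_reg)             # inversion is now well defined
```

In this sketch an explicit `ddof` takes precedence over `biased`, so `simple_cov(x, biased=True, ddof=1)` returns the unbiased estimate. NumPy's `np.cov` follows the same convention for its `bias` and `ddof` arguments.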

Attributes:
weight_col, weight_row : Tensor \((W, L)\)

Toeplitz matrix generators for the columns (lag) and rows (lead) of the weight matrix. L denotes the maximum lag. These parameters are repeated along each diagonal of the weight matrix up to the maximum lag. The weight generators are initialised as exponentials over negative integers with a maximum of 1 at the origin (zero lag; \(e^0\)). The weight attribute is a property that is generated from these parameters as needed.

mask : Tensor \((W, O, O)\) or None

Boolean-valued tensor indicating the entries of the weight tensor that are permitted to take nonzero values. This is determined by the min_lag and max_lag parameters specified at initialisation.

weight : Tensor \((W, O, O)\)

Tensor containing importance or coupling weights for the observations. If this tensor is 1-dimensional, each entry weights the corresponding observation in the covariance computation. If it is 2-dimensional, then it must be square, symmetric, and positive semidefinite. In this case, diagonal entries again correspond to relative importances, while off-diagonal entries indicate coupling factors. For instance, a banded or multi-diagonal tensor can be used to specify inter-temporal coupling for a time series covariance.
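The relationship between the generators, the mask and the `weight` property might look roughly like the NumPy sketch below. The function `toeplitz_weight` and its handling of shapes are hypothetical: it assumes each generator holds one value per lag starting at lag 0, that the first entries of the two generators agree, and that lags beyond the generator length are masked out. The library's actual construction may differ.

```python
import numpy as np

def toeplitz_weight(weight_col, weight_row, n_obs):
    """Expand (W, L) generators into (W, n_obs, n_obs) weights and a mask."""
    n_sets, n_lags = weight_col.shape
    i, j = np.indices((n_obs, n_obs))
    offset = np.abs(i - j)                       # distance from the main diagonal
    mask = offset < n_lags                       # entries allowed to be nonzero
    safe = np.minimum(offset, n_lags - 1)        # clip so indexing stays in range
    lower = weight_col[:, safe]                  # (W, n_obs, n_obs), used where i >= j
    upper = weight_row[:, safe]                  # used where i < j
    weight = np.where(i >= j, lower, upper) * mask
    return weight, np.broadcast_to(mask, weight.shape)

lags = np.arange(3)
gen = np.exp(-lags)[None, :]                     # W=1, L=3: e^0, e^-1, e^-2
weight, mask = toeplitz_weight(gen, gen, n_obs=5)
```

Each generator value is repeated along one diagonal, so only W x L numbers per generator are needed to describe a full \((W, O, O)\) weight tensor. Passing different column and row generators yields a non-symmetric weight in this sketch.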