rbf_kernel

hypercoil.functional.kernel.rbf_kernel(X0: Tensor, X1: Tensor | None = None, theta: Tensor | None = None, gamma: float | None = None) → Tensor

Parameterised RBF kernel between input tensors.

For tensors \(X_0\) and \(X_1\) whose rows are observations and whose columns are features (see Dimension), the parameterised RBF kernel is

\(K_{\theta}(X_0, X_1) = e^{-\gamma (X_0 - X_1)^\intercal \theta (X_0 - X_1)}\)

where \(\theta\) is the kernel parameter, \(\gamma\) is a scaling coefficient, and \(X_0 - X_1\) contains all pairwise differences between vectors in \(X_0\) and \(X_1\). The kernel parameter \(\theta\) can also be interpreted as an inverse covariance.
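The formula can be made concrete with a minimal NumPy sketch for dense 2-D inputs. This is an illustration rather than hypercoil's implementation: the name `rbf_kernel_sketch` is hypothetical, the exponent is assumed to carry the conventional negative sign of an RBF kernel, and the output orientation follows the shape listed under Dimension.

```python
import numpy as np

def rbf_kernel_sketch(X0, X1, theta, gamma):
    """Reference RBF Gram matrix for 2-D inputs (illustrative only).

    X0: (N, P), X1: (M, P), theta: (P, P). Returns an (M, N) array.
    """
    # All pairwise differences between rows of X1 and rows of X0: (M, N, P).
    diff = X1[:, None, :] - X0[None, :, :]
    # Quadratic form d^T theta d for every pair of observations: (M, N).
    quad = np.einsum('mnp,pq,mnq->mn', diff, theta, diff)
    # Assumption: negative exponent, as in the conventional RBF kernel.
    return np.exp(-gamma * quad)

rng = np.random.default_rng(0)
X0 = rng.normal(size=(5, 3))   # N = 5 observations, P = 3 features
X1 = rng.normal(size=(4, 3))   # M = 4 observations, P = 3 features
K = rbf_kernel_sketch(X0, X1, theta=np.eye(3), gamma=1 / 3)
print(K.shape)  # (4, 5)
```

With an identity \(\theta\), the quadratic form reduces to the squared Euclidean distance between each pair of observations, so every entry lies in \((0, 1]\).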

This is the same as gaussian_kernel() but is parameterised in terms of \(\gamma\) rather than \(\sigma\).
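As a rough guide to moving between the two parameterisations, the sketch below assumes the common convention in which a Gaussian of width \(\sigma\) is written \(e^{-d^2 / (2\sigma^2)}\), so that \(\gamma = \frac{1}{2\sigma^2}\). Whether gaussian_kernel() uses exactly this convention is not stated here and should be checked against its own documentation. The helper `sigma_to_gamma` is hypothetical.

```python
import math

def sigma_to_gamma(sigma):
    """Hypothetical helper: gamma equivalent to a Gaussian of width sigma.

    Assumes exp(-d^2 / (2 sigma^2)) == exp(-gamma d^2).
    """
    return 1.0 / (2.0 * sigma ** 2)

sigma, sq_dist = 1.5, 2.0
gamma = sigma_to_gamma(sigma)
# Both parameterisations give the same kernel value for the same squared distance.
same = math.isclose(math.exp(-gamma * sq_dist),
                    math.exp(-sq_dist / (2 * sigma ** 2)))
print(same)  # True
```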

Note

The inputs here are assumed to hold one observation per row, with features along the columns (see Dimension). This differs from the convention frequently used in the literature, but it is directly compatible with the top-k sparse tensor format.

Dimension:
X0 : \((*, N, P)\) or \((N, P, *)\)

N denotes number of observations, P denotes number of features, * denotes any number of additional dimensions. If the input is dense, then the last dimensions should be N and P; if it is sparse, then the first dimensions should be N and P.

X1 : \((*, M, P)\) or \((M, P, *)\)

M denotes number of observations.

theta : \((*, P, P)\) or \((*, P)\)

As above.

Output : \((*, M, N)\) or \((M, N, *)\)

As above.
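The dense shape conventions listed above can be checked with a small NumPy sketch. It covers only the dense layout with leading batch dimensions and an identity \(\theta\); the sparse layout is not illustrated, and `batched_rbf_sketch` is a hypothetical name, not part of hypercoil. The negative exponent is again an assumption of the sketch.

```python
import numpy as np

def batched_rbf_sketch(X0, X1, gamma):
    """Unparameterised (theta = identity) RBF kernel over leading batch axes."""
    # (*, M, 1, P) - (*, 1, N, P) broadcasts to (*, M, N, P).
    diff = X1[..., :, None, :] - X0[..., None, :, :]
    # Squared Euclidean distance over the feature axis: (*, M, N).
    return np.exp(-gamma * (diff ** 2).sum(axis=-1))

X0 = np.zeros((2, 7, 5, 3))   # * = (2, 7), N = 5, P = 3
X1 = np.ones((2, 7, 4, 3))    # * = (2, 7), M = 4, P = 3
K = batched_rbf_sketch(X0, X1, gamma=1 / 3)
print(K.shape)  # (2, 7, 4, 5)
```

The leading dimensions pass through unchanged, and the two trailing axes index the observations of X1 and X0 respectively.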

Parameters:
X0 : tensor

A feature tensor.

X1 : tensor or None

Second feature tensor. If not explicitly provided, the kernel of X0 with itself is computed.

theta : tensor or None

Kernel parameter (generally a representation of a positive definite matrix). If not provided, defaults to identity (an unparameterised kernel). If the last two dimensions are the same size, they are used as a matrix parameter; if they are not, the final axis is instead used as the diagonal of the matrix.

gamma : float or None (default None)

Scaling coefficient. If not explicitly specified, this is automatically set to \(\frac{1}{P}\).

Returns:
tensor

Kernel Gram matrix.
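The defaulting rules described for theta and gamma under Parameters can be sketched as follows. The helpers `resolve_theta` and `resolve_gamma` are hypothetical names used for illustration; they mirror the rules as stated on this page and are not taken from hypercoil's source.

```python
import numpy as np

def resolve_theta(theta, P):
    """Return a (..., P, P) matrix parameter following the stated rules."""
    if theta is None:
        return np.eye(P)                      # identity: unparameterised kernel
    theta = np.asarray(theta)
    if theta.ndim >= 2 and theta.shape[-1] == theta.shape[-2]:
        return theta                          # last two axes already form a matrix
    # Otherwise the final axis is used as the diagonal of the matrix.
    return theta[..., :, None] * np.eye(theta.shape[-1])

def resolve_gamma(gamma, P):
    """Default the scaling coefficient to 1 / P when it is not given."""
    return 1.0 / P if gamma is None else gamma

print(resolve_theta([1.0, 2.0, 3.0], P=3))
print(resolve_gamma(None, P=4))  # 0.25
```

One consequence of the shape rule, at least in this sketch: a batch of exactly P diagonal vectors of length P has shape (P, P) and is therefore read as a single matrix, so such a batch would need to be expanded to full matrices beforehand.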