sigmoid_kernel

hypercoil.functional.kernel.sigmoid_kernel(X0: Tensor, X1: Tensor | None = None, theta: Tensor | None = None, gamma: float | None = None, r: float = 0) → Tensor

Parameterised sigmoid kernel between input tensors.

For tensors \(X_0\) and \(X_1\) containing features in column vectors, the parameterised sigmoid kernel is

\(K_{\theta}(X_0, X_1) = \tanh (\gamma X_0^\intercal \theta X_1 + r)\)

where \(\theta\) is the kernel parameter, and \(\gamma\) and \(r\) are the scaling and offset coefficients, respectively.

Note

Unlike the formula above, the inputs here are assumed to contain features in row vectors: each observation is a row of P features, and observations are stacked along the N (or M) axis. This differs from the convention frequently used in the literature, but it makes the inputs directly compatible with the top-k sparse tensor format.
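As a rough illustration of the row-vector convention, the formula can be sketched for dense inputs in NumPy. This is a minimal sketch and not the hypercoil implementation: the function name `sigmoid_kernel_dense` is hypothetical, all parameters are passed explicitly, and the (N, M) orientation of the result simply follows from the matrix product used here, so it may differ from the orientation the library returns.

```python
import numpy as np

def sigmoid_kernel_dense(X0, X1, theta, gamma, r=0.0):
    """Illustrative dense sigmoid kernel (hypothetical helper, not hypercoil's code)."""
    # Rows are observations, so the bilinear form X0^T theta X1 from the
    # column-vector formula becomes X0 @ theta @ X1^T here.
    bilinear = X0 @ theta @ np.swapaxes(X1, -1, -2)
    return np.tanh(gamma * bilinear + r)

rng = np.random.default_rng(0)
X0 = rng.normal(size=(5, 3))   # N = 5 observations, P = 3 features
X1 = rng.normal(size=(4, 3))   # M = 4 observations, P = 3 features
theta = np.eye(3)              # identity: unparameterised kernel
gamma, r = 1.0 / 3, 0.0

K = sigmoid_kernel_dense(X0, X1, theta, gamma, r)
```

Each entry of `K` is the hyperbolic tangent of a scaled and shifted bilinear form between one row of `X0` and one row of `X1`, so every value lies in the interval [-1, 1].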

Dimension:
X0 : \((*, N, P)\) or \((N, P, *)\)

N denotes number of observations, P denotes number of features, * denotes any number of additional dimensions. If the input is dense, then the last dimensions should be N and P; if it is sparse, then the first dimensions should be N and P.

X1 : \((*, M, P)\) or \((M, P, *)\)

M denotes number of observations.

theta : \((*, P, P)\) or \((*, P)\)

As above.

Output : \((*, M, N)\) or \((M, N, *)\)

As above.

Parameters:
X0 : tensor

A feature tensor.

X1 : tensor or None

Second feature tensor. If not explicitly provided, the kernel of X0 with itself is computed.

theta : tensor or None

Kernel parameter (generally a representation of a positive definite matrix). If not provided, defaults to identity (an unparameterised kernel). If the last two dimensions are the same size, they are used as a matrix parameter; if they are not, the final axis is instead used as the diagonal of the matrix.

gamma : float or None (default None)

Scaling coefficient. If not explicitly specified, this is automatically set to \(\frac{1}{P}\).

r : float (default 0)

Offset coefficient.
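The default handling described for theta and gamma can be sketched as follows. This is only an illustration under stated assumptions: `resolve_parameters` is a hypothetical helper, not a hypercoil function, and it assumes a dense input whose last axis holds the P features.

```python
import numpy as np

def resolve_parameters(X0, theta=None, gamma=None):
    """Hypothetical helper: fill in defaults for a dense input of shape (*, N, P)."""
    P = X0.shape[-1]
    if gamma is None:
        gamma = 1.0 / P                      # documented default scaling
    if theta is None:
        theta = np.eye(P)                    # identity: unparameterised kernel
    else:
        theta = np.asarray(theta, dtype=float)
        is_square = theta.ndim >= 2 and theta.shape[-1] == theta.shape[-2]
        if not is_square:
            # The final axis holds the diagonal; expand it to (*, P, P).
            theta = theta[..., None] * np.eye(P)
    return theta, gamma

X0 = np.zeros((5, 3))
theta_id, gamma_default = resolve_parameters(X0)
theta_diag, _ = resolve_parameters(X0, theta=[1.0, 2.0, 3.0])
theta_batch, _ = resolve_parameters(X0, theta=np.ones((2, 3)))
_, gamma_given = resolve_parameters(X0, gamma=0.5)
```

Under this rule, a stack of exactly P diagonals has shape (P, P) and would be read as a single matrix parameter, so a batch of that size is ambiguous in this sketch.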

Returns:
tensor

Kernel Gram matrix.