DiffusionMap#

class skfda.preprocessing.dim_reduction.DiffusionMap(*, n_components=2, kernel, alpha=0, n_steps=1)[source]#

Functional diffusion maps.

Class that implements functional diffusion maps [1] for both basis and grid representations of the data.

Note

Calling fit followed by transform is not equivalent to calling fit_transform. The former approximates the diffusion coordinates via the Nyström method; the latter computes the true diffusion coordinates.

Parameters:
  • n_components (int) – Dimension of the space in which the functional data are embedded. For visualization purposes, use a value of 2 or 3.

  • kernel (KernelFunction) – Kernel function applied to the functional observations. It measures connectivity or similarity between points; a higher value means greater connectivity.

  • alpha (float) – Density parameter in the interval [0, 1] used in the normalization step. A value of 0 means the data distribution is not taken into account during normalization; higher values take it increasingly into account.

  • n_steps (int) – Number of steps in the random walk.
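How alpha and n_steps typically enter a diffusion map can be sketched with plain NumPy. This is a textbook construction on a precomputed kernel matrix, not the library's implementation: the function `diffusion_coordinates`, the random data, and the Gaussian kernel with unit length scale are all assumptions made for illustration, and each row of `curves` merely stands in for one discretized functional observation.

```python
import numpy as np


def diffusion_coordinates(K, n_components=2, alpha=0.0, n_steps=1):
    """Textbook diffusion map on a precomputed symmetric kernel matrix K."""
    # Kernel density estimate at each observation.
    d = K.sum(axis=1)
    # Density normalization: alpha=0 leaves K unchanged, larger alpha
    # divides out more of the sampling density.
    K_alpha = K / np.outer(d ** alpha, d ** alpha)
    # Degrees of the normalized graph, then the row-stochastic matrix.
    d_alpha = K_alpha.sum(axis=1)
    P = K_alpha / d_alpha[:, None]
    # P is similar to the symmetric matrix S, so eigh can be used.
    S = K_alpha / np.sqrt(np.outer(d_alpha, d_alpha))
    w, v = np.linalg.eigh(S)
    # Sort descending and skip the trivial leading eigenvalue (equal to 1).
    keep = np.argsort(w)[::-1][1:n_components + 1]
    eigvals = w[keep]
    eigvecs_right = v[:, keep] / np.sqrt(d_alpha)[:, None]
    # n_steps of the random walk raise each eigenvalue to that power.
    coords = eigvecs_right * eigvals ** n_steps
    return coords, P, eigvals, eigvecs_right


# Hypothetical data: 20 observations sampled at 3 grid points.
rng = np.random.default_rng(0)
curves = rng.normal(size=(20, 3))
sq_dists = ((curves[:, None, :] - curves[None, :, :]) ** 2).sum(axis=-1)
K = np.exp(-sq_dists / 2.0)

coords, P, eigvals, eigvecs_right = diffusion_coordinates(
    K, n_components=2, alpha=1.0, n_steps=1
)
```

In this sketch the i-th row of `coords` holds the embedding of the i-th observation, and increasing n_steps shrinks the coordinates associated with smaller eigenvalues faster than those associated with larger ones.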

Attributes:
  • transition_matrix_ – transition matrix computed from the data.

  • eigenvalues_ – highest n_components eigenvalues of transition_matrix_ in descending order starting from the second highest.

  • eigenvectors_right_ – right eigenvectors of transition_matrix_ corresponding to eigenvalues_.

  • d_alpha_ – vector of densities of the weighted graph.

  • training_dataset_ – dataset used for training the method. It is needed for the transform method.
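The reason eigenvalues are usually reported starting from the second highest can be illustrated on a small row-stochastic matrix: its largest eigenvalue is 1 with a constant eigenvector, which carries no information about the data. The 3x3 kernel matrix below is a hypothetical example, and the snippet illustrates the general property rather than the library's code.

```python
import numpy as np

# Hypothetical symmetric kernel matrix with positive entries.
K = np.array([
    [1.0, 0.5, 0.1],
    [0.5, 1.0, 0.4],
    [0.1, 0.4, 1.0],
])

# Row-normalize to obtain a transition matrix (every row sums to 1).
P = K / K.sum(axis=1, keepdims=True)

# The constant vector is a right eigenvector with eigenvalue 1.
ones = np.ones(3)
is_trivial = np.allclose(P @ ones, ones)

# Eigenvalues in descending order; the first one is the trivial 1.
eigvals = np.sort(np.linalg.eigvals(P).real)[::-1]
informative = eigvals[1:]
```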

Examples

In this example we fetch the Canadian weather dataset and divide it into train and test sets. We then obtain the diffusion coordinates for the train set and predict these coordinates for the test set.

>>> from skfda.datasets import fetch_weather
>>> from skfda.representation import FDataGrid
>>> from skfda.misc.covariances import Gaussian
>>> from skfda.preprocessing.dim_reduction import DiffusionMap
>>> X, y = fetch_weather(return_X_y=True, as_frame=True)
>>> fd: FDataGrid = X.iloc[:, 0].values
>>> fd_train = fd[:25]
>>> fd_test = fd[25:]
>>> fdm = DiffusionMap(
...     n_components=2,
...     kernel=Gaussian(variance=1, length_scale=1),
...     alpha=1,
...     n_steps=1
... )
>>> embedding_train = fdm.fit_transform(X=fd_train)
>>> embedding_test = fdm.transform(X=fd_test)

References

Methods

fit(X[, y])

Compute the transition matrix and save it.

fit_transform(X[, y])

Compute the diffusion coordinates for the functional data X.

get_metadata_routing()

Get metadata routing of this object.

get_params([deep])

Get parameters for this estimator.

set_output(*[, transform])

Set output container.

set_params(**params)

Set the parameters of this estimator.

transform(X)

Compute the diffusion coordinates for the functional data X.

fit(X, y=None)[source]#

Compute the transition matrix and save it.

Parameters:
  • X (FData) – Functional data for which to obtain diffusion coordinates.

  • y (object) – Ignored.

Returns:

self

Return type:

Self

fit_transform(X, y=None)[source]#

Compute the diffusion coordinates for the functional data X.

The diffusion coordinates of the i-th curve are stored in the i-th row of the returned matrix. Note that fit_transform is not equivalent to applying fit followed by transform.

Parameters:
  • X (FData) – Functional data for which to obtain diffusion coordinates.

  • y (object) – Ignored.

Returns:

Diffusion coordinates for the functional data X.

Return type:

ndarray[tuple[int, …], dtype[floating[Any]]]

get_metadata_routing()#

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:

routing – A MetadataRequest encapsulating routing information.

Return type:

MetadataRequest

get_params(deep=True)#

Get parameters for this estimator.

Parameters:

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params – Parameter names mapped to their values.

Return type:

dict

set_output(*, transform=None)#

Set output container.

See Introducing the set_output API for an example on how to use the API.

Parameters:

transform ({"default", "pandas", "polars"}, default=None) –

Configure output of transform and fit_transform.

  • "default": Default output format of a transformer

  • "pandas": DataFrame output

  • "polars": Polars output

  • None: Transform configuration is unchanged

Added in version 1.4: “polars” option was added.

Returns:

self – Estimator instance.

Return type:

estimator instance

set_params(**params)#

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**params (dict) – Estimator parameters.

Returns:

self – Estimator instance.

Return type:

estimator instance
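The `<component>__<parameter>` naming can be seen in a small scikit-learn pipeline. The steps below are ordinary scikit-learn estimators used as stand-ins, and the step names `scale` and `reduce` are hypothetical; assuming a DiffusionMap were placed in a pipeline step, its parameters would presumably be addressed the same way (for instance `<step>__alpha`).

```python
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# A nested object: a pipeline holding two estimators.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("reduce", PCA(n_components=2)),
])

# Update parameters of the nested components through the pipeline.
pipe.set_params(reduce__n_components=3, scale__with_mean=False)

# get_params exposes the same double-underscore names.
params = pipe.get_params()
```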

transform(X)[source]#

Compute the diffusion coordinates for the functional data X.

Compute the diffusion coordinates of out-of-sample data using the eigenvectors and eigenvalues computed during the training.

Parameters:

X (FData) – Functional data for which to predict diffusion coordinates.

Returns:

Diffusion coordinates for the functional data X.

Return type:

ndarray[tuple[int, …], dtype[floating[Any]]]
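A Nyström-style out-of-sample extension of this kind can be sketched in NumPy as follows. This is a generic formulation under stated assumptions, not the library's code: the helper names `gaussian_kernel`, `fit_diffusion` and `nystrom_transform` are hypothetical, the observations are represented as finite vectors, and the normalization the library applies to new data may differ.

```python
import numpy as np


def gaussian_kernel(A, B, length_scale=1.0):
    """Gaussian kernel between the rows of A and the rows of B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq / (2 * length_scale ** 2))


def fit_diffusion(X_train, n_components=2, alpha=0.0):
    """Return training densities, eigenvalues and right eigenvectors."""
    K = gaussian_kernel(X_train, X_train)
    d = K.sum(axis=1)
    K_alpha = K / np.outer(d ** alpha, d ** alpha)
    d_alpha = K_alpha.sum(axis=1)
    # Symmetric matrix similar to the transition matrix.
    S = K_alpha / np.sqrt(np.outer(d_alpha, d_alpha))
    w, v = np.linalg.eigh(S)
    # Descending order, skipping the trivial leading eigenvalue.
    keep = np.argsort(w)[::-1][1:n_components + 1]
    return d, w[keep], v[:, keep] / np.sqrt(d_alpha)[:, None]


def nystrom_transform(X_new, X_train, d, eigvals, eigvecs,
                      alpha=0.0, n_steps=1):
    """Extend the training eigenvectors to new observations."""
    # Kernel between new and training observations.
    K_new = gaussian_kernel(X_new, X_train)
    d_new = K_new.sum(axis=1)
    # Same density normalization as in training.
    K_new_alpha = K_new / np.outer(d_new ** alpha, d ** alpha)
    # Transition probabilities from each new point to the training points.
    P_new = K_new_alpha / K_new_alpha.sum(axis=1, keepdims=True)
    # Nystrom formula: psi(x) = (1 / lambda) * sum_i p(x, x_i) * psi(x_i).
    psi_new = (P_new @ eigvecs) / eigvals
    return psi_new * eigvals ** n_steps


# Hypothetical data: 25 observations sampled at 3 grid points.
rng = np.random.default_rng(0)
data = rng.normal(size=(25, 3))
X_train, X_test = data[:20], data[20:]

d, eigvals, eigvecs = fit_diffusion(X_train, n_components=2, alpha=1.0)
embedding_train = eigvecs * eigvals
embedding_test = nystrom_transform(
    X_test, X_train, d, eigvals, eigvecs, alpha=1.0
)
```

The extension only needs the kernel between new and training observations, which is why the training dataset has to be kept after fitting.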

Examples using skfda.preprocessing.dim_reduction.DiffusionMap#

Functional Diffusion Maps
