skhubness.reduction.MutualProximity

class skhubness.reduction.MutualProximity(method: str = 'normal', verbose: int = 0, **kwargs)[source]

Hubness reduction with Mutual Proximity [1].

Parameters
method: ‘normal’ or ‘empiric’, default = ‘normal’

Method used to model the distance distributions.

  • ‘normal’ or ‘gaussi’ model distance distributions with independent Gaussians (fast)

  • ‘empiric’ or ‘exact’ model distances with the empirical distributions (slow)

verbose: int, default = 0

If verbose > 0, show a progress bar.
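To illustrate what the ‘normal’ method models, the following is a minimal sketch of Mutual Proximity under independent Gaussians, in the spirit of [1]: the secondary distance is one minus the product of the two survival probabilities. It is not the library's implementation; the helper names gaussian_sf and mp_normal and all numeric values are hypothetical.

```python
import math
import numpy as np


def gaussian_sf(d, mu, sd):
    """Survival function P(D > d) for D ~ N(mu, sd^2)."""
    z = (np.asarray(d, dtype=float) - mu) / (sd * math.sqrt(2.0))
    return 0.5 * np.vectorize(math.erfc)(z)


def mp_normal(d_xy, mu_x, sd_x, mu_y, sd_y):
    """Secondary distance 1 - P(X > d_xy) * P(Y > d_xy), assuming independence.

    mu_x, sd_x (mu_y, sd_y) are the mean and standard deviation of the
    distances observed for object x (object y).
    """
    return 1.0 - gaussian_sf(d_xy, mu_x, sd_x) * gaussian_sf(d_xy, mu_y, sd_y)


# Hypothetical statistics: both objects have mean distance 1.0 and std 0.5.
at_mean = mp_normal(1.0, mu_x=1.0, sd_x=0.5, mu_y=1.0, sd_y=0.5)
closer = mp_normal(0.5, mu_x=1.0, sd_x=0.5, mu_y=1.0, sd_y=0.5)
farther = mp_normal(1.5, mu_x=1.0, sd_x=0.5, mu_y=1.0, sd_y=0.5)
```

A distance equal to both means yields a survival probability of 0.5 for each object, so the secondary distance is 1 - 0.25 = 0.75. Smaller primary distances map to smaller secondary distances.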

References

[1] Schnitzer, D., Flexer, A., Schedl, M., & Widmer, G. (2012). Local and global scaling reduce hubs in space. The Journal of Machine Learning Research, 13(1), 2871–2902.

__init__(method: str = 'normal', verbose: int = 0, **kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

Methods

__init__([method, verbose])

Initialize self.

fit(neigh_dist, neigh_ind[, X, assume_sorted])

Fit the model using neigh_dist and neigh_ind as training data.

fit_transform(neigh_dist, neigh_ind, X[, …])

Equivalent to calling .fit().transform().

transform(neigh_dist, neigh_ind[, X, …])

Transform distances between test and training data with Mutual Proximity.

fit(neigh_dist, neigh_ind, X=None, assume_sorted=None, *args, **kwargs) → skhubness.reduction.mutual_proximity.MutualProximity[source]

Fit the model using neigh_dist and neigh_ind as training data.

Parameters
neigh_dist: np.ndarray, shape (n_samples, n_neighbors)

Distance matrix of training objects (rows) against their individual k nearest neighbors (columns).

neigh_ind: np.ndarray, shape (n_samples, n_neighbors)

Neighbor indices corresponding to the values in neigh_dist.

X: ignored
assume_sorted: ignored
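As an illustration of the expected input layout, the sketch below builds neigh_dist and neigh_ind arrays of shape (n_samples, n_neighbors) from a small data matrix with plain NumPy. The toy data, the choice of Euclidean distance, the value of k, and the exclusion of each object from its own neighbor list are assumptions made for this example only.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))   # hypothetical toy data: 6 samples, 3 features
k = 2                         # hypothetical number of neighbors

# Pairwise Euclidean distances between all training objects.
diff = X[:, None, :] - X[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=-1))

# Assumption: an object is not counted as its own neighbor.
np.fill_diagonal(dist, np.inf)

# Indices of the k nearest neighbors per row, and the matching distances.
neigh_ind = np.argsort(dist, axis=1)[:, :k]
neigh_dist = np.take_along_axis(dist, neigh_ind, axis=1)
```

Both arrays have one row per training object and one column per neighbor, so neigh_dist[i, j] is the distance from object i to the object with index neigh_ind[i, j].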
fit_transform(neigh_dist, neigh_ind, X, assume_sorted=True, return_distance=True, *args, **kwargs)[source]

Equivalent to calling .fit().transform().

transform(neigh_dist, neigh_ind, X=None, assume_sorted=None, *args, **kwargs)[source]

Transform distances between test and training data with Mutual Proximity.

Parameters
neigh_dist: np.ndarray

Distance matrix of test objects (rows) against their individual k nearest neighbors among the training data (columns).

neigh_ind: np.ndarray

Neighbor indices corresponding to the values in neigh_dist.

X: ignored
assume_sorted: ignored
Returns
hub_reduced_dist, neigh_ind

Mutual Proximity distances and the corresponding neighbor indices.

Notes

The returned distances are NOT sorted! If you use this class directly, you will need to sort the returned matrices according to hub_reduced_dist. Classes from skhubness.neighbors do this automatically.
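One way to do this sorting with NumPy is sketched below. The arrays are hypothetical stand-ins for the two values returned by transform; the same row-wise ordering is applied to both so that distances and indices stay aligned.

```python
import numpy as np

# Hypothetical stand-ins for the output of transform().
hub_reduced_dist = np.array([[0.9, 0.2, 0.5],
                             [0.1, 0.7, 0.3]])
neigh_ind = np.array([[4, 7, 1],
                      [2, 0, 5]])

# Row-wise ordering by ascending hubness-reduced distance.
order = np.argsort(hub_reduced_dist, axis=1)

# Apply the same permutation to distances and indices.
sorted_dist = np.take_along_axis(hub_reduced_dist, order, axis=1)
sorted_ind = np.take_along_axis(neigh_ind, order, axis=1)
```

After this step, column 0 of sorted_ind holds the nearest neighbor of each test object under the hubness-reduced distances.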