
class skhubness.reduction.DisSimLocal(k: int = 5, squared: bool = True, *args, **kwargs)[source]

Hubness reduction with DisSimLocal [1].

k: int, default = 5

Number of nearest neighbors used to compute each local centroid.

squared: bool, default = True

DisSimLocal operates on squared Euclidean distances. If True, return (quasi) squared Euclidean distances; if False, return (quasi) Euclidean distances.



[1] Hara K, Suzuki I, Kobayashi K, Fukumizu K, Radovanović M (2016) Flattening the density gradient for eliminating spatial centrality to reduce hubness. In: Proceedings of the 30th AAAI conference on artificial intelligence, pp 1659–1665.
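For orientation, the dissimilarity described in [1] can be sketched in plain NumPy. This is a brute-force illustration and not the library's implementation. The function name `dis_sim_local_sketch` is hypothetical. It assumes the local centroid of a point is the mean of its k nearest neighbors, excluding the point itself, and that the dissimilarity is the squared Euclidean distance minus each point's squared distance to its own local centroid.

```python
import numpy as np


def dis_sim_local_sketch(X, k=5):
    """Brute-force DisSimLocal-style dissimilarities (illustrative only)."""
    X = np.asarray(X, dtype=float)

    # Squared Euclidean distances between all pairs of points.
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.einsum("ijk,ijk->ij", diff, diff)

    # Indices of the k nearest neighbors of each point.
    # Assumption: a point is not counted as its own neighbor.
    d2_no_self = d2.copy()
    np.fill_diagonal(d2_no_self, np.inf)
    knn = np.argsort(d2_no_self, axis=1)[:, :k]

    # Local centroid: mean of each point's k nearest neighbors.
    centroids = X[knn].mean(axis=1)

    # Squared distance from each point to its own local centroid.
    to_centroid = ((X - centroids) ** 2).sum(axis=1)

    # Subtract both centroid terms from the pairwise squared distance.
    return d2 - to_centroid[:, None] - to_centroid[None, :]
```

Because the two centroid terms are subtracted, values from this sketch can be negative, which is one reason the results are better read as (quasi) distances.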

__init__(k: int = 5, squared: bool = True, *args, **kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.


__init__([k, squared])

Initialize self.

fit(neigh_dist, neigh_ind, X[, assume_sorted])

Fit the model using X, neigh_dist, and neigh_ind as training data.

fit_transform(neigh_dist, neigh_ind, X[, …])

Equivalent to calling .fit().transform().

transform(neigh_dist, neigh_ind, X[, …])

Transform distance between test and training data with DisSimLocal.

fit(neigh_dist: numpy.ndarray, neigh_ind: numpy.ndarray, X: numpy.ndarray, assume_sorted: bool = True, *args, **kwargs) → skhubness.reduction.dis_sim.DisSimLocal[source]

Fit the model using X, neigh_dist, and neigh_ind as training data.

neigh_dist: np.ndarray, shape (n_samples, n_neighbors)

Distance matrix of training objects (rows) against their individual k nearest neighbors (columns).

neigh_ind: np.ndarray, shape (n_samples, n_neighbors)

Neighbor indices corresponding to the values in neigh_dist.

X: np.ndarray, shape (n_samples, n_features)

Training data, where n_samples is the number of vectors, and n_features their dimensionality (number of features).

assume_sorted: bool, default = True

Assume input matrices are sorted according to neigh_dist. If False, these are sorted here.
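The following sketch shows one way to build inputs with the shapes listed above. The helper `brute_force_knn` is hypothetical and not part of skhubness. It assumes plain Euclidean distances and excludes each training point from its own neighbor list. Whether fit expects squared or unsquared distances is not stated here, so treat that choice as an assumption.

```python
import numpy as np


def brute_force_knn(X, n_neighbors):
    """Return (neigh_dist, neigh_ind), each of shape (n_samples, n_neighbors)."""
    X = np.asarray(X, dtype=float)

    # Pairwise Euclidean distances between all training points.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Assumption: a training point is not its own neighbor.
    np.fill_diagonal(dist, np.inf)

    # Keep the n_neighbors closest points per row, in ascending order.
    neigh_ind = np.argsort(dist, axis=1)[:, :n_neighbors]
    neigh_dist = np.take_along_axis(dist, neigh_ind, axis=1)
    return neigh_dist, neigh_ind


rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))  # (n_samples, n_features)
neigh_dist, neigh_ind = brute_force_knn(X, n_neighbors=10)
```

Rows built this way are already sorted by distance, which matches the default assume_sorted=True. The arrays could then be passed as `DisSimLocal(k=5).fit(neigh_dist, neigh_ind, X)`. Presumably n_neighbors should be at least k so that each local centroid can be formed from k neighbors.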

fit_transform(neigh_dist, neigh_ind, X, assume_sorted=True, return_distance=True, *args, **kwargs)[source]

Equivalent to calling .fit().transform().

transform(neigh_dist: np.ndarray, neigh_ind: np.ndarray, X: np.ndarray, assume_sorted: bool = True, *args, **kwargs)[source]

Transform distance between test and training data with DisSimLocal.

neigh_dist: np.ndarray, shape (n_query, n_neighbors)

Distance matrix of test objects (rows) against their individual k nearest neighbors among the training data (columns).

neigh_ind: np.ndarray, shape (n_query, n_neighbors)

Neighbor indices corresponding to the values in neigh_dist.

X: np.ndarray, shape (n_query, n_features)

Test data, where n_query is the number of vectors, and n_features their dimensionality (number of features).

assume_sorted: ignored

hub_reduced_dist, neigh_ind

DisSimLocal distances, and corresponding neighbor indices


The returned distances are NOT sorted! If you use this class directly, you will need to sort the returned matrices according to hub_reduced_dist. Classes from skhubness.neighbors do this automatically.
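If you do call this class directly, the re-sorting can be done with a row-wise argsort. The snippet below is a minimal sketch. The helper name `sort_by_distance` is hypothetical, and the input arrays are made-up stand-ins for the hub_reduced_dist and neigh_ind returned by transform.

```python
import numpy as np


def sort_by_distance(hub_reduced_dist, neigh_ind):
    """Sort each row by ascending distance, keeping indices aligned."""
    order = np.argsort(hub_reduced_dist, axis=1)
    sorted_dist = np.take_along_axis(hub_reduced_dist, order, axis=1)
    sorted_ind = np.take_along_axis(neigh_ind, order, axis=1)
    return sorted_dist, sorted_ind


# Made-up example values standing in for the output of transform().
hub_reduced_dist = np.array([[0.3, 0.1, 0.2],
                             [2.0, 3.0, 1.0]])
neigh_ind = np.array([[7, 4, 9],
                      [0, 5, 2]])

sorted_dist, sorted_ind = sort_by_distance(hub_reduced_dist, neigh_ind)
# sorted_dist -> [[0.1, 0.2, 0.3], [1.0, 2.0, 3.0]]
# sorted_ind  -> [[4, 9, 7], [2, 0, 5]]
```

Applying the same permutation to both arrays keeps every distance paired with the index of the training object it refers to.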