skhubness.reduction.DisSimLocal

class skhubness.reduction.DisSimLocal(k: int = 5, squared: bool = True, *args, **kwargs)
Hubness reduction with DisSimLocal [1].
 Parameters
 k: int, default = 5
Number of nearest neighbors used to compute each point's local centroid.
 squared: bool, default = True
DisSimLocal operates on squared Euclidean distances. If True, return (quasi) squared Euclidean distances; if False, return (quasi) Euclidean distances.
References
 [1] Hara K, Suzuki I, Kobayashi K, Fukumizu K, Radovanović M (2016) Flattening the density gradient for eliminating spatial centrality to reduce hubness. In: Proceedings of the 30th AAAI conference on artificial intelligence, pp 1659–1665. https://www.aaai.org/ocs/index.php/AAAI/AAAI16/paper/viewPaper/12055
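
For orientation, the dissimilarity described in [1] takes the squared Euclidean distance between two points and subtracts each point's squared distance to its local centroid, that is, the mean of its k nearest neighbors. The NumPy sketch below illustrates that idea on a full pairwise matrix. The helper name dis_sim_local_sketch, the brute-force neighbor search, and the clipping used when squared=False are assumptions made for illustration; this is not the library's implementation.

```python
import numpy as np


def dis_sim_local_sketch(X, k=5, squared=True):
    """Illustrative DisSimLocal on a full pairwise matrix (hypothetical helper)."""
    X = np.asarray(X, dtype=float)

    # Squared Euclidean distances between all pairs of rows.
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.einsum("ijk,ijk->ij", diff, diff)

    # k nearest neighbors of each point, excluding the point itself.
    d2_no_self = d2.copy()
    np.fill_diagonal(d2_no_self, np.inf)
    neigh_ind = np.argsort(d2_no_self, axis=1)[:, :k]

    # Local centroid: mean of each point's k nearest neighbors.
    centroids = X[neigh_ind].mean(axis=1)

    # Squared distance of each point to its own local centroid.
    c2 = ((X - centroids) ** 2).sum(axis=1)

    # Subtract both points' centroid terms from the squared distance.
    dsl = d2 - c2[:, None] - c2[None, :]

    if not squared:
        # Assumption: clip negative values so the square root stays real.
        dsl = np.sqrt(np.clip(dsl, 0.0, None))
    return dsl


rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 3))
D_sq = dis_sim_local_sketch(X_demo, k=5)
D_root = dis_sim_local_sketch(X_demo, k=5, squared=False)
```

Because the centroid terms are subtracted, the raw values can be negative, which is why the parameter description above speaks of "(quasi)" distances.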

__init__(k: int = 5, squared: bool = True, *args, **kwargs)
Initialize self. See help(type(self)) for accurate signature.
Methods
__init__([k, squared]): Initialize self.
fit(neigh_dist, neigh_ind, X[, assume_sorted]): Fit the model using X, neigh_dist, and neigh_ind as training data.
fit_transform(neigh_dist, neigh_ind, X[, …]): Equivalent to calling .fit().transform().
transform(neigh_dist, neigh_ind, X[, …]): Transform distances between test and training data with DisSimLocal.

fit(neigh_dist: numpy.ndarray, neigh_ind: numpy.ndarray, X: numpy.ndarray, assume_sorted: bool = True, *args, **kwargs) → skhubness.reduction.dis_sim.DisSimLocal
Fit the model using X, neigh_dist, and neigh_ind as training data.
 Parameters
 neigh_dist: np.ndarray, shape (n_samples, n_neighbors)
Distance matrix of training objects (rows) against their individual k nearest neighbors (columns).
 neigh_ind: np.ndarray, shape (n_samples, n_neighbors)
Neighbor indices corresponding to the values in neigh_dist.
 X: np.ndarray, shape (n_samples, n_features)
Training data, where n_samples is the number of vectors, and n_features their dimensionality (number of features).
 assume_sorted: bool, default = True
Assume input matrices are sorted according to neigh_dist. If False, these are sorted here.
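
The inputs to fit describe a k-nearest-neighbor graph of the training set as two aligned arrays of shape (n_samples, n_neighbors). A minimal NumPy sketch of how such arrays could be built by brute force follows. The helper name knn_graph, the use of plain Euclidean distances, and the exclusion of each point from its own neighbor list are assumptions for illustration. The commented-out call only mirrors the signature documented above.

```python
import numpy as np


def knn_graph(X, n_neighbors):
    """Brute-force k-NN graph of X against itself (hypothetical helper)."""
    X = np.asarray(X, dtype=float)

    # Pairwise Euclidean distances, shape (n_samples, n_samples).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

    # Assumption: a point is not counted as its own neighbor.
    np.fill_diagonal(dist, np.inf)

    # Column indices of the n_neighbors smallest distances per row, ascending.
    neigh_ind = np.argsort(dist, axis=1)[:, :n_neighbors]
    neigh_dist = np.take_along_axis(dist, neigh_ind, axis=1)
    return neigh_dist, neigh_ind


rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))
neigh_dist, neigh_ind = knn_graph(X, n_neighbors=10)
print(neigh_dist.shape, neigh_ind.shape)

# Rows are already sorted by distance, so the default assume_sorted=True fits:
# from skhubness.reduction import DisSimLocal
# hr = DisSimLocal(k=5).fit(neigh_dist, neigh_ind, X)
```

If the neighbor arrays come from a source that does not guarantee ascending order within each row, passing assume_sorted=False lets fit do the sorting, as described above.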

fit_transform(neigh_dist, neigh_ind, X, assume_sorted=True, return_distance=True, *args, **kwargs)
Equivalent to calling .fit().transform().

transform(neigh_dist: np.ndarray, neigh_ind: np.ndarray, X: np.ndarray, assume_sorted: bool = True, *args, **kwargs)
Transform distances between test and training data with DisSimLocal.
 Parameters
 neigh_dist: np.ndarray, shape (n_query, n_neighbors)
Distance matrix of test objects (rows) against their individual k nearest neighbors among the training data (columns).
 neigh_ind: np.ndarray, shape (n_query, n_neighbors)
Neighbor indices corresponding to the values in neigh_dist.
 X: np.ndarray, shape (n_query, n_features)
Test data, where n_query is the number of vectors, and n_features their dimensionality (number of features).
 assume_sorted: ignored
 Returns
 hub_reduced_dist, neigh_ind
DisSimLocal distances, and corresponding neighbor indices
Notes
The returned distances are NOT sorted! If you use this class directly, you will need to sort the returned matrices according to hub_reduced_dist. Classes from skhubness.neighbors do this automatically.
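
One way to do that sorting with NumPy is sketched below. The helper name sort_by_distance and the toy arrays are hypothetical stand-ins for the two arrays returned by transform. Each row is reordered by ascending distance, and the same permutation is applied to the indices so the two arrays stay aligned.

```python
import numpy as np


def sort_by_distance(hub_reduced_dist, neigh_ind):
    """Reorder each row by ascending distance (hypothetical helper)."""
    # Per-row permutation that sorts the distances.
    order = np.argsort(hub_reduced_dist, axis=1)
    # Apply the same permutation to distances and indices.
    sorted_dist = np.take_along_axis(hub_reduced_dist, order, axis=1)
    sorted_ind = np.take_along_axis(neigh_ind, order, axis=1)
    return sorted_dist, sorted_ind


# Toy stand-ins for the output of transform.
hub_reduced_dist = np.array([[0.9, 0.1, 0.5],
                             [0.2, 0.8, 0.3]])
neigh_ind = np.array([[4, 7, 2],
                      [1, 0, 5]])

sorted_dist, sorted_ind = sort_by_distance(hub_reduced_dist, neigh_ind)
print(sorted_dist)
print(sorted_ind)
```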