skhubness.reduction.DisSimLocal
-
class skhubness.reduction.DisSimLocal(k: int = 5, squared: bool = True, *args, **kwargs)
Hubness reduction with DisSimLocal [1].
- Parameters
- k: int, default = 5
Number of neighbors to consider for the local centroids
- squared: bool, default = True
DisSimLocal operates on squared Euclidean distances. If True, return (quasi) squared Euclidean distances; if False, return (quasi) Euclidean distances.
References
- 1
Hara K, Suzuki I, Kobayashi K, Fukumizu K, Radovanović M (2016) Flattening the density gradient for eliminating spatial centrality to reduce hubness. In: Proceedings of the 30th AAAI conference on artificial intelligence, pp 1659–1665. https://www.aaai.org/ocs/index.php/AAAI/AAAI16/paper/viewPaper/12055
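As a rough illustration of what k and the squared distances refer to, the sketch below follows the DisSimLocal formulation as it is commonly stated for [1]: the squared Euclidean distance between two points, minus each point's squared distance to its local centroid (the mean of its k nearest neighbors). This is a brute-force NumPy sketch under that assumption, not the skhubness implementation; the function name dis_sim_local_sketch and the demo data are hypothetical.

```python
import numpy as np


def dis_sim_local_sketch(X, k=5):
    """Pairwise DisSimLocal-style dissimilarities within X (illustrative only)."""
    X = np.asarray(X, dtype=float)
    # Squared Euclidean distances between all pairs of rows.
    diff = X[:, None, :] - X[None, :, :]
    sq_dist = np.einsum("ijk,ijk->ij", diff, diff)
    # Indices of the k nearest neighbors of each point, excluding the point itself.
    masked = sq_dist.copy()
    np.fill_diagonal(masked, np.inf)
    knn_ind = np.argsort(masked, axis=1)[:, :k]
    # Local centroid: mean of each point's k nearest neighbors.
    centroids = X[knn_ind].mean(axis=1)
    # Squared distance from each point to its own local centroid.
    to_centroid = np.sum((X - centroids) ** 2, axis=1)
    # Subtract both centroid terms from the squared pairwise distances.
    return sq_dist - to_centroid[:, None] - to_centroid[None, :]


rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 3))  # hypothetical demo data
dsl_demo = dis_sim_local_sketch(X_demo, k=5)
```

Because two non-negative terms are subtracted, entries of the result can be negative, so they behave as dissimilarities rather than true distances.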
-
__init__(k: int = 5, squared: bool = True, *args, **kwargs)
Initialize self. See help(type(self)) for accurate signature.
Methods
- __init__([k, squared])
Initialize self.
- fit(neigh_dist, neigh_ind, X[, assume_sorted])
Fit the model using X, neigh_dist, and neigh_ind as training data.
- fit_transform(neigh_dist, neigh_ind, X[, …])
Equivalent to calling .fit().transform()
- transform(neigh_dist, neigh_ind, X[, …])
Transform distance between test and training data with DisSimLocal.
-
fit(neigh_dist: numpy.ndarray, neigh_ind: numpy.ndarray, X: numpy.ndarray, assume_sorted: bool = True, *args, **kwargs) → skhubness.reduction.dis_sim.DisSimLocal
Fit the model using X, neigh_dist, and neigh_ind as training data.
- Parameters
- neigh_dist: np.ndarray, shape (n_samples, n_neighbors)
Distance matrix of training objects (rows) against their individual k nearest neighbors (columns).
- neigh_ind: np.ndarray, shape (n_samples, n_neighbors)
Neighbor indices corresponding to the values in neigh_dist.
- X: np.ndarray, shape (n_samples, n_features)
Training data, where n_samples is the number of vectors, and n_features their dimensionality (number of features).
- assume_sorted: bool, default = True
Assume input matrices are sorted according to neigh_dist. If False, these are sorted here.
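The shapes described above can be sketched with a brute-force neighbor search in NumPy. This only illustrates the expected layout of neigh_dist and neigh_ind (one row per training object, sorted by ascending distance); the helper brute_force_knn is hypothetical, and both the use of plain Euclidean distances and the exclusion of each point from its own neighbor list are assumptions made for this example.

```python
import numpy as np


def brute_force_knn(X, n_neighbors):
    """Return (neigh_dist, neigh_ind) for the rows of X, sorted by distance."""
    X = np.asarray(X, dtype=float)
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt(np.einsum("ijk,ijk->ij", diff, diff))
    # Assumption: a point is not counted among its own neighbors.
    np.fill_diagonal(dist, np.inf)
    # Column indices of the n_neighbors smallest distances per row, ascending.
    neigh_ind = np.argsort(dist, axis=1)[:, :n_neighbors]
    neigh_dist = np.take_along_axis(dist, neigh_ind, axis=1)
    return neigh_dist, neigh_ind


rng = np.random.default_rng(42)
X_train = rng.normal(size=(50, 4))  # (n_samples, n_features)
neigh_dist, neigh_ind = brute_force_knn(X_train, n_neighbors=10)
```

Arrays built this way are already sorted per row, which matches the default assume_sorted=True; they would then be passed as fit(neigh_dist, neigh_ind, X_train).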
-
fit_transform(neigh_dist, neigh_ind, X, assume_sorted=True, return_distance=True, *args, **kwargs)
Equivalent to calling .fit().transform()
-
transform(neigh_dist: np.ndarray, neigh_ind: np.ndarray, X: np.ndarray, assume_sorted: bool = True, *args, **kwargs)
Transform distance between test and training data with DisSimLocal.
- Parameters
- neigh_dist: np.ndarray, shape (n_query, n_neighbors)
Distance matrix of test objects (rows) against their individual k nearest neighbors among the training data (columns).
- neigh_ind: np.ndarray, shape (n_query, n_neighbors)
Neighbor indices corresponding to the values in neigh_dist.
- X: np.ndarray, shape (n_query, n_features)
Test data, where n_query is the number of vectors, and n_features their dimensionality (number of features).
- assume_sorted: ignored
- Returns
- hub_reduced_dist, neigh_ind
DisSimLocal distances and the corresponding neighbor indices.
Notes
The returned distances are NOT sorted! If you use this class directly, you will need to sort the returned matrices according to hub_reduced_dist. Classes from skhubness.neighbors do this automatically.
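One way to do that sorting, sketched with NumPy only: compute the per-row order of hub_reduced_dist and apply the same order to both matrices so that distances and indices stay aligned. The helper sort_by_distance and the sample values are hypothetical stand-ins for the output of transform().

```python
import numpy as np


def sort_by_distance(hub_reduced_dist, neigh_ind):
    """Sort each row of both matrices by ascending hub_reduced_dist."""
    order = np.argsort(hub_reduced_dist, axis=1)
    sorted_dist = np.take_along_axis(hub_reduced_dist, order, axis=1)
    sorted_ind = np.take_along_axis(neigh_ind, order, axis=1)
    return sorted_dist, sorted_ind


# Made-up values standing in for the two arrays returned by transform().
hub_reduced_dist = np.array([[0.9, -0.2, 0.4], [0.1, 0.7, 0.3]])
neigh_ind = np.array([[5, 2, 8], [1, 4, 0]])
sorted_dist, sorted_ind = sort_by_distance(hub_reduced_dist, neigh_ind)
# sorted_dist → [[-0.2, 0.4, 0.9], [0.1, 0.3, 0.7]]
# sorted_ind  → [[2, 8, 5], [1, 0, 4]]
```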