skhubness.reduction.DisSimLocal

class skhubness.reduction.DisSimLocal(k: int = 5, squared: bool = True, *args, **kwargs)

Hubness reduction with DisSimLocal [1].

Parameters

k: int, default = 5: Number of neighbors to consider for the local centroids
squared: bool, default = True: DisSimLocal operates on squared Euclidean distances. If True, return (quasi) squared Euclidean distances; if False, return (quasi) Euclidean distances.
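To make the role of k and the local centroids concrete, here is a minimal NumPy sketch of the dissimilarity as it is commonly stated for DisSimLocal [1]: the squared Euclidean distance between two points, minus each point's squared distance to the centroid of its own k nearest neighbors. The function name `dis_sim_local_sketch` is hypothetical, the sketch covers only the squared case, and it is not the library's implementation, which may differ in details such as how negative values are handled.

```python
import numpy as np


def dis_sim_local_sketch(X, k=5):
    """Illustrative only: pairwise DisSimLocal-style dissimilarities within X."""
    X = np.asarray(X, dtype=float)

    # Squared Euclidean distances between all pairs of rows.
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)

    # Indices of each point's k nearest neighbors (assumption: self excluded).
    d2_no_self = d2.copy()
    np.fill_diagonal(d2_no_self, np.inf)
    knn = np.argsort(d2_no_self, axis=1)[:, :k]

    # Local centroid of each point: mean of its k nearest neighbors.
    centroids = X[knn].mean(axis=1)

    # Squared distance of each point to its own local centroid.
    to_centroid = np.sum((X - centroids) ** 2, axis=1)

    # Subtract both local terms; the result can be negative,
    # which is why the values are only "quasi" distances.
    return d2 - to_centroid[:, None] - to_centroid[None, :]


rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 3))
D_demo = dis_sim_local_sketch(X_demo, k=5)
```

The result is symmetric, and its diagonal is non-positive because a point's distance to itself is zero while its centroid term is subtracted twice.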

References

1: Hara K, Suzuki I, Kobayashi K, Fukumizu K, Radovanović M (2016) Flattening the density gradient for eliminating spatial centrality to reduce hubness. In: Proceedings of the 30th AAAI conference on artificial intelligence, pp 1659–1665. https://www.aaai.org/ocs/index.php/AAAI/AAAI16/paper/viewPaper/12055

__init__(k: int = 5, squared: bool = True, *args, **kwargs): Initialize self. See help(type(self)) for accurate signature.

Methods

`__init__`([k, squared])	Initialize self.
`fit`(neigh_dist, neigh_ind, X[, assume_sorted])	Fit the model using X, neigh_dist, and neigh_ind as training data.
`fit_transform`(neigh_dist, neigh_ind, X[, …])	Equivalent to calling .fit().transform().
`transform`(neigh_dist, neigh_ind, X[, …])	Transform distances between test and training data with DisSimLocal.

fit(neigh_dist: numpy.ndarray, neigh_ind: numpy.ndarray, X: numpy.ndarray, assume_sorted: bool = True, *args, **kwargs) → skhubness.reduction.dis_sim.DisSimLocal

Fit the model using X, neigh_dist, and neigh_ind as training data.

Parameters

neigh_dist: np.ndarray, shape (n_samples, n_neighbors): Distance matrix of training objects (rows) against their individual k nearest neighbors (columns).
neigh_ind: np.ndarray, shape (n_samples, n_neighbors): Neighbor indices corresponding to the values in neigh_dist.
X: np.ndarray, shape (n_samples, n_features): Training data, where n_samples is the number of vectors, and n_features their dimensionality (number of features).
assume_sorted: bool, default = True: Assume input matrices are sorted according to neigh_dist. If False, these are sorted here.
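As an illustration of the expected input shapes, the following sketch builds neigh_dist and neigh_ind from a training matrix with a brute-force NumPy search. The variable names and data are made up for the example. Two points are assumptions not stated here: that plain (not squared) Euclidean distances are passed, and that a point is not listed as its own neighbor.

```python
import numpy as np

rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 8))  # (n_samples, n_features)
n_neighbors = 10

# Brute-force Euclidean distances between all training points.
diff = X_train[:, None, :] - X_train[None, :, :]
dist = np.sqrt(np.sum(diff ** 2, axis=-1))
np.fill_diagonal(dist, np.inf)  # assumption: exclude each point from its own list

# Indices of the n_neighbors smallest distances per row, nearest first.
neigh_ind = np.argsort(dist, axis=1)[:, :n_neighbors]
# Distances in the same order, so both arrays are sorted by neigh_dist.
neigh_dist = np.take_along_axis(dist, neigh_ind, axis=1)
```

Because each row is already ordered nearest first, these arrays match what assume_sorted=True expects.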

fit_transform(neigh_dist, neigh_ind, X, assume_sorted=True, return_distance=True, *args, **kwargs): Equivalent to calling .fit().transform().

transform(neigh_dist: np.ndarray, neigh_ind: np.ndarray, X: np.ndarray, assume_sorted: bool = True, *args, **kwargs)

Transform distances between test and training data with DisSimLocal.

Parameters

neigh_dist: np.ndarray, shape (n_query, n_neighbors): Distance matrix of test objects (rows) against their individual k nearest neighbors among the training data (columns).
neigh_ind: np.ndarray, shape (n_query, n_neighbors): Neighbor indices corresponding to the values in neigh_dist.
X: np.ndarray, shape (n_query, n_features): Test data, where n_query is the number of vectors, and n_features their dimensionality (number of features).
assume_sorted: bool, default = True: Ignored.

Returns

hub_reduced_dist, neigh_ind: DisSimLocal distances, and corresponding neighbor indices

Notes

The returned distances are NOT sorted! If you use this class directly, you will need to sort the returned matrices according to hub_reduced_dist. Classes from skhubness.neighbors do this automatically.
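One way to do that sorting with NumPy is sketched below. The two small arrays are made-up stand-ins for the matrices returned by transform; only the argsort and take_along_axis steps matter.

```python
import numpy as np

# Hypothetical stand-ins for the unsorted output of transform().
hub_reduced_dist = np.array([[0.9, 0.1, 0.5],
                             [0.3, 0.7, 0.2]])
neigh_ind = np.array([[4, 7, 1],
                      [0, 3, 8]])

# Per-row ordering that sorts the reduced distances ascending.
order = np.argsort(hub_reduced_dist, axis=1)

# Apply the same ordering to both matrices so they stay aligned.
sorted_dist = np.take_along_axis(hub_reduced_dist, order, axis=1)
sorted_ind = np.take_along_axis(neigh_ind, order, axis=1)
```

Applying one shared `order` array to both matrices keeps each neighbor index paired with its distance.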