scikit-hubness: high-dimensional data mining

scikit-hubness is a Python package for analysis of hubness in high-dimensional data. It provides hubness reduction and approximate nearest neighbor search via a drop-in replacement for sklearn.neighbors.

Getting started

Get started with scikit-hubness in a breeze. Find out how to install the package and see all core functionality applied in a single quick start example.

User Guide

The User Guide introduces the main concepts of scikit-hubness. It explains how to analyze your data sets for hubness and how to use the package to lift this curse of dimensionality. You will also find examples of how to use skhubness.neighbors for approximate nearest neighbor search (with or without hubness reduction).
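To give a feel for what "analyzing a data set for hubness" means, here is a minimal sketch that estimates hubness with plain NumPy. It does not use the scikit-hubness API. The helper names k_occurrence and k_skewness are hypothetical and chosen for illustration only. The sketch assumes Euclidean distances and measures hubness as the skewness of the k-occurrence distribution (how often each point appears among the k nearest neighbors of other points), a commonly used hubness measure.

```python
import numpy as np


def k_occurrence(X, k=10):
    """Count how often each point is among the k nearest neighbors of other points."""
    # Squared Euclidean distances between all pairs of rows.
    sq_norms = (X ** 2).sum(axis=1)
    dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    # Indices of the k nearest neighbors of every point (unordered within the k).
    neighbors = np.argpartition(dist, k, axis=1)[:, :k]
    return np.bincount(neighbors.ravel(), minlength=X.shape[0])


def k_skewness(X, k=10):
    """Skewness of the k-occurrence distribution; larger values suggest more hubness."""
    counts = k_occurrence(X, k).astype(float)
    centered = counts - counts.mean()
    return (centered ** 3).mean() / (centered ** 2).mean() ** 1.5


# Illustrative synthetic data (an assumption, not taken from the package docs).
rng = np.random.default_rng(0)
low_dim = rng.standard_normal((500, 3))
high_dim = rng.standard_normal((500, 300))

print("k-skewness, 3 dimensions:  ", k_skewness(low_dim))
print("k-skewness, 300 dimensions:", k_skewness(high_dim))
```

On synthetic Gaussian data like this, the high-dimensional sample typically shows a noticeably more skewed k-occurrence distribution: a few points (hubs) appear in very many neighbor lists, while many others appear in almost none. For real analyses, refer to the User Guide for the package's own tools.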

API Documentation

The API Documentation provides detailed information on the implemented methods, including method descriptions, parameters, references, and examples. Find all the information about specific modules and functions of scikit-hubness in this section.

History

A brief history of the package and how it relates to the Hub-Toolboxes.

Development

There are several ways to contribute to this free, open-source software. We highly appreciate all input from the community, be it bug reports or code contributions.

Source code, issue tracking, discussion, and continuous integration are hosted on our GitHub page.

What’s new

To see what’s new in the latest version of scikit-hubness, have a look at the changelog.