Warning
This documentation is workinprogress and unorganized.
Nearestneighbor queries:
KDTree – class for efficient nearestneighbor queries distance – module containing many different distance measures
kdtree for quick nearestneighbor lookup
This class provides an index into a set of kdimensional points which can be used to rapidly look up the nearest neighbors of any point.
The algorithm used is described in Maneewongvatana and Mount 1999. The general idea is that the kdtree is a binary tree, each of whose nodes represents an axisaligned hyperrectangle. Each node specifies an axis and splits the set of points based on whether their coordinate along that axis is greater than or less than a particular value.
During construction, the axis and splitting point are chosen by the “sliding midpoint” rule, which ensures that the cells do not all become long and thin.
The tree can be queried for the r closest neighbors of any given point (optionally returning only those within some maximum distance of the point). It can also be queried, with a substantial gain in efficiency, for the r approximate closest neighbors.
For large dimensions (20 is already large) do not expect this to run significantly faster than brute force. Highdimensional nearestneighbor queries are a substantial open problem in computer science.
The tree also supports allneighbors queries, both with arrays of points and with other kdtrees. These do use a reasonably efficient algorithm, but the kdtree is not necessarily the best data structure for this sort of calculation.
Methods
count_neighbors(other, r[, p])  Count how many nearby pairs can be formed. 
innernode  
leafnode  
node  
query(x[, k, eps, p, distance_upper_bound])  query the kdtree for nearest neighbors 
query_ball_point(x, r[, p, eps])  Find all points within r of x 
query_ball_tree(other, r[, p, eps])  Find all pairs of points whose distance is at most r 
query_pairs(r[, p, eps])  Find all pairs of points whose distance is at most r 
sparse_distance_matrix(other, max_distance)  Compute a sparse distance matrix 
Count how many nearby pairs can be formed.
Count the number of pairs (x1,x2) can be formed, with x1 drawn from self and x2 drawn from other, and where distance(x1,x2,p)<=r. This is the “twopoint correlation” described in Gray and Moore 2000, “Nbody problems in statistical learning”, and the code here is based on their algorithm.
Parameters:  other : KDTree r : float or onedimensional array of floats
p : float, 1<=p<=infinity


Returns:  result : integer or onedimensional array of integers

query the kdtree for nearest neighbors
Parameters:  x : arraylike, last dimension self.m
k : integer
eps : nonnegative float
p : float, 1<=p<=infinity
distance_upper_bound : nonnegative float


Returns:  d : array of floats
i : array of integers

Examples
>>> from scipy.spatial import KDTree
>>> x, y = np.mgrid[0:5, 2:8]
>>> tree = KDTree(zip(x.ravel(), y.ravel()))
>>> tree.data
array([[0, 2],
[0, 3],
[0, 4],
[0, 5],
[0, 6],
[0, 7],
[1, 2],
[1, 3],
[1, 4],
[1, 5],
[1, 6],
[1, 7],
[2, 2],
[2, 3],
[2, 4],
[2, 5],
[2, 6],
[2, 7],
[3, 2],
[3, 3],
[3, 4],
[3, 5],
[3, 6],
[3, 7],
[4, 2],
[4, 3],
[4, 4],
[4, 5],
[4, 6],
[4, 7]])
>>> pts = np.array([[0, 0], [2.1, 2.9]])
>>> tree.query(pts)
(array([ 2. , 0.14142136]), array([ 0, 13]))
Find all points within r of x
Parameters:  x : array_like, shape tuple + (self.m,)
r : positive float
p : float 1<=p<=infinity
eps : nonnegative float


Returns:  results : list or array of lists
Note: if you have many points whose neighbors you want to find, you may save : substantial amounts of time by putting them in a KDTree and using query_ball_tree. : 
Find all pairs of points whose distance is at most r
Parameters:  other : KDTree
r : positive float
p : float 1<=p<=infinity
eps : nonnegative float


Returns:  results : list of lists

Find all pairs of points whose distance is at most r
Parameters:  r : positive float
p : float 1<=p<=infinity
eps : nonnegative float


Returns:  results : set

Compute a sparse distance matrix
Computes a distance matrix between two KDTrees, leaving as zero any distance greater than max_distance.
Parameters:  other : KDTree max_distance : positive float 

Returns:  result : dok_matrix

Hyperrectangle class.
Represents a Cartesian product of intervals.
Methods
max_distance_point(x[, p])  Compute the maximum distance between x and a point in the hyperrectangle. 
max_distance_rectangle(other[, p])  Compute the maximum distance between points in the two hyperrectangles. 
min_distance_point(x[, p])  Compute the minimum distance between x and a point in the hyperrectangle. 
min_distance_rectangle(other[, p])  Compute the minimum distance between points in the two hyperrectangles. 
split(d, split)  Produce two hyperrectangles by splitting along axis d. 
volume()  Total volume. 
Produce two hyperrectangles by splitting along axis d.
In general, if you need to compute maximum and minimum distances to the children, it can be done more efficiently by updating the maximum and minimum distances to the parent.
kdtree for quick nearestneighbor lookup
This class provides an index into a set of kdimensional points which can be used to rapidly look up the nearest neighbors of any point.
The algorithm used is described in Maneewongvatana and Mount 1999. The general idea is that the kdtree is a binary trie, each of whose nodes represents an axisaligned hyperrectangle. Each node specifies an axis and splits the set of points based on whether their coordinate along that axis is greater than or less than a particular value.
During construction, the axis and splitting point are chosen by the “sliding midpoint” rule, which ensures that the cells do not all become long and thin.
The tree can be queried for the r closest neighbors of any given point (optionally returning only those within some maximum distance of the point). It can also be queried, with a substantial gain in efficiency, for the r approximate closest neighbors.
For large dimensions (20 is already large) do not expect this to run significantly faster than brute force. Highdimensional nearestneighbor queries are a substantial open problem in computer science.
Methods
query  query the kdtree for nearest neighbors 
Compute the distance matrix.
Computes the matrix of all pairwise distances.
Parameters:  x : arraylike, m by k y : arraylike, n by k p : float 1<=p<=infinity
threshold : positive integer


Returns:  result : arraylike, m by n 
Compute the pth power of the L**p distance between x and y
For efficiency, this function computes the L**p distance but does not extract the pth root. If p is 1 or infinity, this is equal to the actual L**p distance.