#####   ######  #####    ###    #   #  ###  #   #  ######
##  ##  ##      ##  ##  ## ##   #   #   #   #   #  ##
#####   ####    #####   #   #   #   #   #   #   #  ####
##  #   ##      ##      ## ##    # #    #    # #   ##
##   #  ######  ##       ###      #    ###    #    ######

$ curl repovive.com/algorithms/closest-pair

░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░███████████████████████████████████████████

#####   ######  #####    ###    #   #  ###  #   #  ######
##  ##  ##      ##  ##  ## ##   #   #   #   #   #  ##
#####   ####    #####   #   #   #   #   #   #   #  ####
##  #   ##      ##      ## ##    # #    #    # #   ##
##   #  ######  ##       ###      #    ###    #    ######

$ curl repovive.com/algorithms/closest-pair

Repovive

Closest Pair of Points - Algorithm Tutorial | Repovive | Repovive

Repovive.

Back to Algorithms

Closest Pair of Points Algorithm Tutorial

hardDivide and Conquer8 min read

Introduction

The Closest Pair of Points problem asks: given $n$ points in the plane, find the pair with the smallest distance.

Naive approach: Check all $\binom{n}{2}$ pairs. Time: $O(n^2)$ .

Better approach: Use divide and conquer to achieve $O(n \log n)$ time.

Why squared distance? Computing actual distance requires a square root, which introduces floating-point errors. By using squared distance, we stay in integers and get exact answers.

Applications:

Collision detection in games and simulations
Clustering algorithms (hierarchical clustering)
Geographic information systems
Computational geometry subroutines

Divide and Conquer Approach

The algorithm follows the divide and conquer paradigm:

Divide:

Sort points by x-coordinate
Split into left half and right half at the median x-coordinate

Conquer: 3. Recursively find closest pair in left half: $d_L$ 4. Recursively find closest pair in right half: $d_R$ 5. Let $d = \min(d_L, d_R)$

Combine: 6. The closest pair might cross the dividing line 7. Only points within distance $d$ of the dividing line could form such a pair 8. Check pairs in this "strip"

The key insight is that checking the strip takes $O(n)$ time, not $O(n^2)$ .

The Strip Case

Why is the strip check efficient?

Consider points in the strip sorted by y-coordinate. For each point $p$ , we only need to check points $q$ where $|q.y - p.y| < d$ .

Key lemma: In a $d \times 2d$ rectangle centered on the dividing line, there can be at most 8 points (at most 4 on each side).

This is because each side has minimum pairwise distance $d$ , so in a $d \times d$ square, at most 4 points can fit (one in each corner).

Implication: For each point in the strip, we check at most 7 subsequent points (by y-coordinate). So the strip check is $O(n)$ , not $O(n^2)$ .

text

function strip_closest(strip, d):
    sort strip by y-coordinate
    min_d = d
    for i from 0 to len(strip)-1:
        j = i + 1
        while j < len(strip) and (strip[j].y - strip[i].y)^2 < min_d:
            min_d = min(min_d, dist_sq(strip[i], strip[j]))
            j++
    return min_d

Complete Algorithm

text

function closest_pair(points):
    if len(points) <= 3:
        return brute_force(points)

    // Sort by x-coordinate
    sort points by x

    mid = len(points) / 2
    mid_x = points[mid].x

    // Divide
    left = points[0:mid]
    right = points[mid:]

    // Conquer
    d_left = closest_pair(left)
    d_right = closest_pair(right)
    d = min(d_left, d_right)

    // Build strip
    strip = []
    for p in points:
        if (p.x - mid_x)^2 < d:
            strip.append(p)

    // Combine
    d_strip = strip_closest(strip, d)

    return min(d, d_strip)

Base case: For 2 or 3 points, brute force is efficient.

Sorting: Initial sort is $O(n \log n)$ . Maintain sorted order during recursion or re-sort the strip (still $O(n \log n)$ overall).

Implement this Algorithm

Go to Problem

Complexity Analysis

Time Complexity:

Let $T(n)$ be the time for $n$ points.

$T(n) = 2T(n/2) + O(n)$

The $O(n)$ term comes from:

Building the strip: $O(n)$
Sorting strip by y: $O(n \log n)$ but can be $O(n)$ with merge
Checking strip: $O(n)$ (at most 7 comparisons per point)

By the Master Theorem: $T(n) = O(n \log n)$

Space Complexity: $O(n)$

Storing points
Recursion depth: $O(\log n)$
Strip array: $O(n)$

Comparison with naive:

Algorithm	Time	Space
Brute Force	$O(n^2)$	$O(1)$
Divide & Conquer	$O(n \log n)$	$O(n)$

For $n = 50000$ : brute force needs $\approx 1.25 \times 10^9$ operations, D&C needs $\approx 800000$ .