algorithm – Comparing 8 different Disjoint-Set data structure variants in Java

The Wikipedia page on Disjoint-Set data structures presents $4$ distinct algorithms for finding the root node of the tree, and $2$ distinct algorithms for performing the union operation. In this post, I will present all $4 * 2$ combinations:

com.github.coderodde.util.disjointset.DisjointSet:

package com.github.coderodde.util.disjointset;

import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/**
 * This class implements the 
 * <a href="https://en.wikipedia.org/wiki/Disjoint-set_data_structure">
 * disjoint-set data structure</a>.
 * 
 * @param <E> the satellite data type.
 *
 * @author Rodion "rodde" Efremov
 * @version 1.6 (Sep 5, 2021)
 * @since 1.6 (Sep 5, 2021).
 */
public final class DisjointSet<E> {

    private final Map<E, DisjointSetNode<E>> disjointSetMap = new HashMap<>();

    /**
     * The disjoint set root finder.
     */
    private final AbstractDisjointSetRootFinder<E> disjointSetRootFinder;

    /**
     * The disjoint set operation provider.
     */
    private final AbstractDisjointSetUnionComputer<E> disjointSetUnionComputer;

    /**
     * Constructs a disjoint-set data structure with specific operation 
     * implementations.
     * 
     * @param disjointSetRootFinder the root finder operation implementation 
     * object.
     * 
     * @param disjointSetUnionComputer the union operation implementation 
     * object.
     */
    public DisjointSet(AbstractDisjointSetRootFinder<E> disjointSetRootFinder,
                       AbstractDisjointSetUnionComputer<E> disjointSetUnionComputer) {

        this.disjointSetRootFinder = 
                Objects.requireNonNull(
                        disjointSetRootFinder, 
                        "The input DisjointSetRootFinder is null.");

        this.disjointSetUnionComputer = 
                Objects.requireNonNull(
                        disjointSetUnionComputer, 
                        "The input DisjointSetUnionComputer is null.");

        this.disjointSetRootFinder.ownerDisjointSet = this;
        this.disjointSetUnionComputer.ownerDisjointSet = this;
    }

    /**
     * Finds the root of the tree to which {@code item} belongs to.
     * 
     * @param item the target item.
     * @return the root of the tree to which {@code item} belongs to.
     */
    public E find(E item) {
        return disjointSetRootFinder.find(item);
    }

    /**
     * Unites the two trees into one, unless the two items already belong to the
     * same tree.
     * 
     * @param item1 the first item.
     * @param item2 the second item.
     */
    public void union(E item1, E item2) {
        disjointSetUnionComputer.union(item1, item2);
    }

    DisjointSetNode<E> find(DisjointSetNode<E> node) {
        if (node == node.getParent()) {
            return node;
        }

        return find(node.getParent());
    }

    DisjointSetNode<E> getNode(E item) {
        DisjointSetNode<E> node = disjointSetMap.get(item);

        if (node == null) {
            node = new DisjointSetNode<E>(item);
            disjointSetMap.put(item, node);
        }

        return node;
    }
}

com.github.coderodde.util.disjointset.DisjointSetNode:

package com.github.coderodde.util.disjointset;

/**
 * This class implements the nodes of the disjoint-set data structures.
 * 
 * @param <E> the satellite data type.
 *
 * @author Rodion "rodde" Efremov
 * @version 1.6 (Sep 4, 2021)
 * @since 1.6 (Sep 4, 2021)
 */
final class DisjointSetNode<E> {

    /**
     * The actual item being held.
     */
    private final E item;

    /**
     * The parent node of this node.
     */
    private DisjointSetNode<E> parent;

    /**
     * The rank of this node. The rank of a node is its maximum height from any
     * leaf node.
     */
    private int rank;

    /**
     * The size of this node. The size of a node is the number of nodes under
     * it.
     */
    private int size;

    public DisjointSetNode(E item) {
        this.item = item;;
        this.parent = this; 
    }

    public E getItem() {
        return item;
    }

    public DisjointSetNode<E> getParent() {
        return parent;
    }

    public int getRank() {
        return rank;
    }

    public int getSize() {
        return size;
    }

    @Override
    public String toString() {
        return "(" + item + ")";
    }

    void setParent(DisjointSetNode<E> parent) {
        this.parent = parent;
    }

    void setRank(int rank) {
        this.rank = rank;
    }

    void setSize(int size) {
        this.size = size;
    }
}

com.github.coderodde.util.disjointset.AbstractDisjointSetRootFinder:

package com.github.coderodde.util.disjointset;

/**
 * This abstract class defines the API for for root finding algorithms in a 
 * disjoint-set data structure.
 * 
 * @param <E> the satellite data type.
 * 
 * @author Rodion "rodde" Efremov
 * @version 1.6 (Sep 5, 2021)
 * @since 1.6 (Sep 5, 2021)
 */
public abstract class AbstractDisjointSetRootFinder<E> {

    DisjointSet<E> ownerDisjointSet;

    /**
     * If {@code item} belongs to a tree, returns the root of that three. 
     * Otherwise, a trivial, empty tree {@code T} is created, and {@code item} 
     * is added to {@code T}.
     * 
     * @param item the query item.
     * @return the root of the tree to which {@code item} belongs to.
     */
    public abstract E find(E item);
}

com.github.coderodde.util.disjointset.AbstractDisjointSetUnionComputer:

package com.github.coderodde.util.disjointset;

/**
 * This abstract class defines the API for the union computer algorithms in a 
 * disjoint-set data structure.
 * 
 * @param <E> the satellite date type.
 *
 * @author Rodion "rodde" Efremov
 * @version 1.6 (Sep 5, 2021)
 * @since 1.6 (Sep 5, 2021)
 */
public abstract class AbstractDisjointSetUnionComputer<E> {

    DisjointSet<E> ownerDisjointSet;

    /**
     * If both {@code item1} and {@code item2} belong to the same tree, does 
     * nothing. Otherwise, unites the two into a single tree.
     * 
     * @param item1 the first item.
     * @param item2 the second item;
     */
    public abstract void union(E item1, E item2);
}

com.github.coderodde.util.disjointset.DisjointSetIterativePathCompressionNodeFinder:

package com.github.coderodde.util.disjointset;

/**
 * This class implements the root finding algorithm for the disjoint-set data
 * structure using iterative path compression.
 * 
 * @param <E> the satellite data type.
 * 
 * @author Rodion "rodde" Efremov
 * @version 1.6 (Sep 5, 2021)
 * @since 1.6 (Sep 5, 2021)
 */
public final class DisjointSetIterativePathCompressionNodeFinder<E>
extends AbstractDisjointSetRootFinder<E> {

    /**
     * {@inheritDoc } 
     */
    @Override
    public E find(E item) {
        DisjointSetNode<E> node = 
                ownerDisjointSet.find(ownerDisjointSet.getNode(item));

        DisjointSetNode<E> root = node;

        while (root.getParent() != root) {
            root = root.getParent();
        }

        while (node.getParent() != root) {
            DisjointSetNode<E> parent = node.getParent();
            node.setParent(root);
            node = parent;
        }

        return root.getItem();
    }
}

com.github.coderodde.util.disjointset.DisjointSetRecursivePathCompressionNodeFinder:

package com.github.coderodde.util.disjointset;

/**
 * This class implements the root finding algorithm for the disjoint-set data
 * structure using recursive path compression.
 *
 * @param <E> the satellite data type.
 *
 * @author Rodion "rodde" Efremov
 * @version 1.6 (Sep 5, 2021)
 * @since 1.6 (Sep 5, 2021)
 */
public final class DisjointSetRecursivePathCompressionNodeFinder<E>
extends AbstractDisjointSetRootFinder<E> {

    /**
     * {@inheritDoc }
     */
    @Override
    public E find(E item) {
        DisjointSetNode<E> node = 
                ownerDisjointSet.find(ownerDisjointSet.getNode(item));

        if (node == node.getParent()) {
            return node.getItem();
        }

        node.setParent(ownerDisjointSet.find(node.getParent()));
        return node.getParent().getItem();
    }
}

com.github.coderodde.util.disjointset.DisjointSetPathHalvingNodeFinder:

package com.github.coderodde.util.disjointset;

/**
 * This class implements the root finding algorithm for the disjoint-set data
 * structure using path halving.
 * 
 * @param <E> the satellite data type.
 * 
 * @author Rodion "rodde" Efremov
 * @version 1.6 (Sep 5, 2021)
 * @since 1.6 (Sep 5, 2021)
 */
public final class DisjointSetPathHalvingNodeFinder<E>
extends AbstractDisjointSetRootFinder<E> {

    /**
     * {@inheritDoc }
     */
    @Override
    public E find(E item) {
        DisjointSetNode<E> node =
                ownerDisjointSet.find(ownerDisjointSet.getNode(item));

        while (node.getParent() != node) {
            node.setParent(node.getParent().getParent());
            node = node.getParent();
        }

        return node.getItem();
    }
}

com.github.coderodde.util.disjointset.DisjointSetPathSplittingNodeFinder:

package com.github.coderodde.util.disjointset;

/**
 *
 * This class implements the root finding algorithm for the disjoint-set data
 * structure using path splitting.
 * 
 * @param <E> the satellite data type.
 * 
 * @author Rodion "rodde" Efremov
 * @version 1.6 (Sep 5, 2021)
 * @since 1.6 (Sep 5, 2021)
 */
public final class DisjointSetPathSplittingNodeFinder<E>
extends AbstractDisjointSetRootFinder<E> {

    @Override
    public E find(E item) {
        DisjointSetNode<E> node =
                ownerDisjointSet.find(ownerDisjointSet.getNode(item));

        while (node.getParent() != node) {
            DisjointSetNode<E> parent = node.getParent();
            node.setParent(parent.getParent());
            node = parent;
        }

        return node.getItem();
    }
}

com.github.coderodde.util.disjointset.DisjointSetUnionByRankComputer:

package com.github.coderodde.util.disjointset;

/**
 * This class implements the union operation by rank.
 * 
 * @param <E> the satellite data type.
 * 
 * @author Rodion "rodde" Efremov
 * @version 1.6 (Sep 5, 2021)
 * @since 1.6 (Sep 5, 2021)
 */
public final class DisjointSetUnionByRankComputer<E> 
extends AbstractDisjointSetUnionComputer<E> {
    
    /**
     * {@inheritDoc }
     */
    @Override
    public void union(E item1, E item2) {
        DisjointSetNode<E> node1 = 
                ownerDisjointSet.find(ownerDisjointSet.getNode(item1));

        DisjointSetNode<E> node2 = 
                ownerDisjointSet.find(ownerDisjointSet.getNode(item2));

        if (node1 == node2) {
            return;
        }

        if (node1.getRank() < node2.getRank()) {
            DisjointSetNode<E> tempNode = node1;
            node1 = node2;
            node2 = tempNode;
        }

        node2.setParent(node1);

        if (node1.getRank() == node2.getRank()) {
            node1.setRank(node1.getRank() + 1);
        }
    }
}

com.github.coderodde.util.disjointset.DisjointSet:

package com.github.coderodde.util.disjointset;

/**
 * This class implements the union operation by tree size.
 * 
 * @param <E> the satellite data type.
 * 
 * @author Rodion "rodde" Efremov
 * @version 1.6 (Sep 5, 2021)
 * @since 1.6 (Sep 5, 2021)
 */
public final class DisjointSetUnionBySizeComputer<E>
extends AbstractDisjointSetUnionComputer<E> {

    /**
     * {@inheritDoc }
     */
    @Override
    public void union(E item1, E item2) {
        DisjointSetNode<E> node1 = 
                ownerDisjointSet.find(ownerDisjointSet.getNode(item1));

        DisjointSetNode<E> node2 = 
                ownerDisjointSet.find(ownerDisjointSet.getNode(item2));

        if (node1 == node2) {
            return;
        }

        if (node1.getSize() < node2.getSize()) {
            DisjointSetNode<E> tempNode = node1;
            node1 = node2;
            node2 = tempNode;
        }
        // Here, node1.getSize() >= node2.getSize()
        node2.setParent(node1);
        node1.setSize(node1.getSize() + node2.getSize());
    }
}

Repository

The entire story is here.

The typical benchmark output might be:

Seed: 1630845583522
The benchmark graph is built in 222 milliseconds.

................................................................................
Root finder: DisjointSetRecursivePathCompressionNodeFinder, union computer: DisjointSetUnionByRankComputer
Duration: 149 ms. Total edges: 199997. Total weight: 36417.94367839121.
................................................................................
Root finder: DisjointSetRecursivePathCompressionNodeFinder, union computer: DisjointSetUnionBySizeComputer
Duration: 4670 ms. Total edges: 199997. Total weight: 36417.94367839121.
................................................................................
Root finder: DisjointSetIterativePathCompressionNodeFinder, union computer: DisjointSetUnionByRankComputer
Duration: 153 ms. Total edges: 199997. Total weight: 36417.94367839121.
................................................................................
Root finder: DisjointSetIterativePathCompressionNodeFinder, union computer: DisjointSetUnionBySizeComputer
Duration: 4649 ms. Total edges: 199997. Total weight: 36417.94367839121.
................................................................................
Root finder: DisjointSetPathHalvingNodeFinder, union computer: DisjointSetUnionByRankComputer
Duration: 157 ms. Total edges: 199997. Total weight: 36417.94367839121.
................................................................................
Root finder: DisjointSetPathHalvingNodeFinder, union computer: DisjointSetUnionBySizeComputer
Duration: 4645 ms. Total edges: 199997. Total weight: 36417.94367839121.
................................................................................
Root finder: DisjointSetPathSplittingNodeFinder, union computer: DisjointSetUnionByRankComputer
Duration: 141 ms. Total edges: 199997. Total weight: 36417.94367839121.
................................................................................
Root finder: DisjointSetPathSplittingNodeFinder, union computer: DisjointSetUnionBySizeComputer
Duration: 4633 ms. Total edges: 199997. Total weight: 36417.94367839121.
................................................................................
1. Root finder: DisjointSetPathSplittingNodeFinder, union computer: DisjointSetUnionByRankComputer, 141 milliseconds.
2. Root finder: DisjointSetRecursivePathCompressionNodeFinder, union computer: DisjointSetUnionByRankComputer, 149 milliseconds.
3. Root finder: DisjointSetIterativePathCompressionNodeFinder, union computer: DisjointSetUnionByRankComputer, 153 milliseconds.
4. Root finder: DisjointSetPathHalvingNodeFinder, union computer: DisjointSetUnionByRankComputer, 157 milliseconds.
5. Root finder: DisjointSetPathSplittingNodeFinder, union computer: DisjointSetUnionBySizeComputer, 4633 milliseconds.
6. Root finder: DisjointSetPathHalvingNodeFinder, union computer: DisjointSetUnionBySizeComputer, 4645 milliseconds.
7. Root finder: DisjointSetIterativePathCompressionNodeFinder, union computer: DisjointSetUnionBySizeComputer, 4649 milliseconds.
8. Root finder: DisjointSetRecursivePathCompressionNodeFinder, union computer: DisjointSetUnionBySizeComputer, 4670 milliseconds.

Critique request

Please, tell me anything that comes to mind. In particular, is there any way to put the implementation classes into an implementation package (perhaps com.github.coderodde.util.disjointset.impl)?