Quick Start¶
Get up and running with kmeanssa-ng in 5 minutes! This guide will walk you through your first clustering example.
Basic Example¶
Let's cluster points on a simple quantum graph with two natural clusters:
from kmeanssa_ng import generate_sbm, SimulatedAnnealing
# Step 1: Create a quantum graph with block structure
graph = generate_sbm(
sizes=[30, 30], # Two clusters of 30 nodes each
p=[[0.8, 0.1], # 80% intra-cluster edges, 10% inter-cluster
[0.1, 0.8]] # 10% inter-cluster, 80% intra-cluster
)
# Step 2: Precompute distances (important for performance!)
graph.precomputing()
# Step 3: Sample random points on the graph
points = graph.sample_points(100) # 100 points distributed on edges
# Step 4: Run k-means clustering
sa = SimulatedAnnealing(
points=points,
k=2, # Number of clusters
lambda_param=1, # Temperature parameter
beta=1.0, # Drift strength
step_size=0.1 # Step size for random walks
)
# Step 5: Get cluster centers
centers = sa.run(
robust_prop=0.1, # Use last 10% iterations for averaging
initialization="kpp", # k-means++ initialization
algorithm_version="v1" # Algorithm variant
)
# Step 6: Assign points to clusters
graph.compute_clusters(centers)
print(f"Found {len(centers)} cluster centers")
print(f"Centers located at: {[str(c) for c in centers]}")
Understanding the Results¶
The algorithm returns centers - a list of QGCenter objects representing the optimal cluster centers. Each center has:
edge: The edge where the center is locatedposition: Position on the edge (between 0 and 1)
for i, center in enumerate(centers):
print(f"Cluster {i+1}: {center}")
# Output: Cluster 1: QGCenter(edge=(5, 8), position=0.342)
Customizing the Algorithm¶
Algorithm Parameters¶
# More exploration (higher temperature)
sa = SimulatedAnnealing(points, k=2, lambda_param=2, beta=0.5)
# More exploitation (stronger drift)
sa = SimulatedAnnealing(points, k=2, lambda_param=0.5, beta=2.0)
# Smaller steps (more precise)
sa = SimulatedAnnealing(points, k=2, step_size=0.05)
Different Graph Types¶
Quantum Graph Specific Features¶
For quantum graphs, you can use specialized methods:
from kmeanssa_ng import QGSimulatedAnnealing
# Get node-based results (useful for interpretation)
qg_sa = QGSimulatedAnnealing(points, k=2)
node_ids = qg_sa.run_for_kmeans(robust_prop=0.1)
print(f"Cluster centers near nodes: {node_ids}")
# For k=1 (finding the mean)
mean_sa = QGSimulatedAnnealing(points, k=1)
mean_node = mean_sa.run_for_mean(robust_prop=0.1)
print(f"Mean center near node: {mean_node}")
Performance Tips¶
Speed Up Your Clustering
- Always call
precomputing()- This caches shortest paths - Use appropriate
robust_prop- 0.1 (10%) is usually sufficient - Start with k-means++ - Set
initialization="kpp" - Tune parameters gradually - Start with defaults, then adjust
# Fast setup for large graphs
graph.precomputing() # Essential!
sa = SimulatedAnnealing(points, k=3, step_size=0.2) # Larger steps
centers = sa.run(robust_prop=0.05, initialization="kpp") # Less robustification
Next Steps¶
Now that you've run your first example:
- Core Concepts - Understand quantum graphs and simulated annealing
- User Guide - Deep dive into quantum graphs
- Examples - More practical examples
- API Reference - Complete API documentation
Common Patterns¶
Batch Processing¶
# Process multiple graphs
results = []
for graph_params in parameter_sets:
graph = generate_sbm(**graph_params)
graph.precomputing()
points = graph.sample_points(100)
sa = SimulatedAnnealing(points, k=2)
centers = sa.run(initialization="kpp")
results.append(centers)
Parameter Optimization¶
# Test different parameters
best_energy = float('inf')
best_params = None
for lambda_param in [0.5, 1.0, 2.0]:
for beta in [0.5, 1.0, 2.0]:
sa = SimulatedAnnealing(points, k=2,
lambda_param=lambda_param, beta=beta)
centers = sa.run()
energy = sa._compute_energy(centers)
if energy < best_energy:
best_energy = energy
best_params = (lambda_param, beta)
print(f"Best parameters: λ={best_params[0]}, β={best_params[1]}")