Engineering · 2026-03-30 · 10 min read

How AI Clustering Turns 500 Feedback Items into 12 Actionable Groups

We use vector embeddings and cosine similarity to automatically group similar feedback. Here's how the pipeline works — from raw text to prioritized clusters with sentiment analysis.

The problem: feedback overload

Your app has 500 feedback items. Some are about dark mode. Others mention "night theme" or "dark color scheme." They're all the same request — but without reading every single one, you'd never know. Manual categorization doesn't scale.

Vector embeddings: turning text into numbers

When a user submits feedback, we send the text to Voyage AI (voyage-3-lite model), which returns a 1024-dimensional vector. Each dimension captures a semantic aspect of the text. "Add dark mode" and "night theme option" produce vectors that are very close in this high-dimensional space, even though the words are different.

We store these vectors in PostgreSQL using the pgvector extension. This lets us run similarity searches directly in the database — no external vector store needed.
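To make the database-side search concrete, here is a minimal sketch of how a nearest-cluster lookup could be phrased with pgvector. The `clusters` table, its `centroid` column, and the `%s` placeholder style (as used by psycopg-like drivers) are assumptions for illustration, not the actual schema. The one thing taken from pgvector itself is the `<=>` operator, which returns cosine distance, so similarity is `1 - distance`.

```python
def to_pgvector_literal(vector):
    """Format a sequence of numbers as a pgvector text literal, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(repr(float(x)) for x in vector) + "]"


# Hypothetical schema: clusters(id, centroid vector(1024)).
# `<=>` is pgvector's cosine distance operator; similarity = 1 - distance.
NEAREST_CLUSTER_SQL = """
SELECT id, 1 - (centroid <=> %s::vector) AS similarity
FROM clusters
ORDER BY centroid <=> %s::vector
LIMIT 1
"""


def nearest_cluster_params(embedding):
    """Build the parameter tuple for NEAREST_CLUSTER_SQL (the literal is used twice)."""
    literal = to_pgvector_literal(embedding)
    return (literal, literal)
```

With a driver cursor, this would be run as something like `cursor.execute(NEAREST_CLUSTER_SQL, nearest_cluster_params(embedding))`, returning the closest cluster and its similarity in one round trip.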

Cosine similarity: finding related feedback

When new feedback arrives, we compute cosine similarity against all existing cluster centroids. If the similarity exceeds our threshold (0.82), the feedback joins that cluster. If not, a new cluster is created.

This threshold was tuned empirically. Too low (0.7) and unrelated feedback gets grouped. Too high (0.9) and obvious duplicates stay separate. 0.82 hits the sweet spot for product feedback.
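The join-or-create rule can be sketched in a few lines of plain Python. The 0.82 threshold comes from the description above. The in-memory cluster dictionaries and the running-mean centroid update are assumptions made for this illustration, since the article does not say how centroids are maintained.

```python
import math

SIMILARITY_THRESHOLD = 0.82  # threshold quoted in the article


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def assign_to_cluster(embedding, clusters, threshold=SIMILARITY_THRESHOLD):
    """Join the most similar cluster if it exceeds the threshold, else create one.

    `clusters` is a list of dicts with 'centroid' and 'size' keys (hypothetical
    in-memory stand-in for the database). Returns the index of the cluster used.
    """
    best_index, best_similarity = None, -1.0
    for index, cluster in enumerate(clusters):
        similarity = cosine_similarity(embedding, cluster["centroid"])
        if similarity > best_similarity:
            best_index, best_similarity = index, similarity

    if best_index is not None and best_similarity > threshold:
        cluster = clusters[best_index]
        n = cluster["size"]
        # Assumption: the centroid is kept as the running mean of member vectors.
        cluster["centroid"] = [
            (mean * n + x) / (n + 1) for mean, x in zip(cluster["centroid"], embedding)
        ]
        cluster["size"] = n + 1
        return best_index

    clusters.append({"centroid": list(embedding), "size": 1})
    return len(clusters) - 1
```

Lowering `threshold` makes clusters absorb more loosely related items, while raising it splits near-duplicates apart, which is the trade-off described above.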

Auto-naming with Claude

Once a cluster has 2+ items, we send the feedback titles to Claude haiku-4-5 and ask for a concise cluster name and summary. The prompt is simple:

Given these feedback items, generate a short title (under 30 chars)
Write a 1-sentence summary of what users are asking for
Classify sentiment as positive, negative, or neutral

The result is stored and displayed immediately. Admins can edit the title and summary inline if the AI gets it wrong.
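As an illustration of the naming step, the sketch below builds a prompt from the feedback titles and validates the model's reply. It deliberately leaves out the API call itself. The JSON reply format, the function names, and the fallback to "neutral" for an unexpected sentiment are all assumptions, not details confirmed by the article.

```python
import json

ALLOWED_SENTIMENTS = {"positive", "negative", "neutral"}
TITLE_LIMIT = 30  # the prompt asks for titles "under 30 chars"


def build_naming_prompt(titles):
    """Assemble the naming prompt from a list of feedback titles."""
    items = "\n".join(f"- {title}" for title in titles)
    return (
        "Given these feedback items:\n"
        f"{items}\n\n"
        "1. Generate a short title (under 30 chars).\n"
        "2. Write a 1-sentence summary of what users are asking for.\n"
        "3. Classify sentiment as positive, negative, or neutral.\n\n"
        # Assumption: asking for JSON makes the reply easy to validate.
        'Reply with JSON only, using the keys "title", "summary", "sentiment".'
    )


def parse_naming_response(raw):
    """Validate the model's JSON reply and clamp it to the stated limits."""
    data = json.loads(raw)
    title = str(data["title"]).strip()[: TITLE_LIMIT - 1]  # keep it under 30 chars
    summary = str(data["summary"]).strip()
    sentiment = str(data["sentiment"]).strip().lower()
    if sentiment not in ALLOWED_SENTIMENTS:
        sentiment = "neutral"  # assumption: fall back rather than fail
    return {"title": title, "summary": summary, "sentiment": sentiment}
```

Validating the reply before storing it keeps an over-long title or an off-list sentiment label from reaching the admin view, where it would otherwise need a manual fix.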

Priority scoring

Each cluster gets a priority score (0–100) based on three factors:

Vote count — total votes across all feedback in the cluster (weighted 50%)
Feedback count — how many separate users reported this (weighted 30%)
Recency — average age of feedback items (weighted 20%)

High-priority clusters (70+) are highlighted in red, medium (40–69) in amber, and low (below 40) in green. This gives product managers an instant view of what matters most.
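The weighting can be written out as a small function. The 50/30/20 weights and the color bands come from the description above. How each factor is normalized to a 0–1 range is not stated, so the caps used here (`vote_cap`, `count_cap`, `max_age_days`) are purely hypothetical values chosen to make the sketch runnable.

```python
def priority_score(votes, feedback_count, avg_age_days,
                   vote_cap=100, count_cap=50, max_age_days=90):
    """Combine the three factors into a 0-100 score.

    The caps are illustrative assumptions: each factor is scaled to 0..1
    before the article's weights (50% / 30% / 20%) are applied.
    """
    vote_part = min(votes / vote_cap, 1.0)
    count_part = min(feedback_count / count_cap, 1.0)
    # Newer feedback scores higher; anything older than max_age_days scores 0.
    recency_part = max(0.0, 1.0 - avg_age_days / max_age_days)
    score = 100 * (0.5 * vote_part + 0.3 * count_part + 0.2 * recency_part)
    return round(score)


def priority_color(score):
    """Map a score to the highlight color bands described in the article."""
    if score >= 70:
        return "red"
    if score >= 40:
        return "amber"
    return "green"
```

Under these assumed caps, a cluster with 50 votes, 25 reports, and an average age of 45 days sits at half of every factor and lands in the amber band.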

The full pipeline

Putting it all together, every feedback submission triggers this async pipeline:

1. User submits feedback via widget or public board
2. Text is embedded using Voyage AI (1024-dim vector)
3. Cosine similarity search against existing clusters
4. Join existing cluster or create new one
5. Claude generates/updates cluster name + summary + sentiment
6. Priority score recalculated
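The steps above can be sketched as a single async function. To avoid guessing at internals, each stage is passed in as a callable. The names `embed`, `assign`, `summarize`, and `rescore` are hypothetical stand-ins for the embedding call, the similarity search plus join-or-create step, the Claude naming step, and the scoring step.

```python
import asyncio


async def process_feedback(text, embed, assign, summarize, rescore):
    """Run one feedback item through the stages in order.

    All four stage arguments are async callables supplied by the caller
    (hypothetical names; the real stage boundaries may differ).
    """
    embedding = await embed(text)            # text -> vector
    cluster_id = await assign(embedding)     # similarity search + join or create
    await summarize(cluster_id)              # refresh name, summary, sentiment
    await rescore(cluster_id)                # recompute the priority score
    return cluster_id


def submit_in_background(text, embed, assign, summarize, rescore):
    """Schedule the pipeline without blocking the request handler.

    Must be called from inside a running event loop; returns the asyncio Task.
    """
    return asyncio.create_task(
        process_feedback(text, embed, assign, summarize, rescore)
    )
```

Running the stages as a background task is one way to let the submission return immediately while clustering happens behind it.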

The entire pipeline runs in under 2 seconds. Users see their feedback instantly, and admins see it auto-categorized in the cluster view.
