# Semantic Search
NeuralRepo goes beyond keyword matching. Every idea is converted into a vector embedding that captures its meaning, enabling search by concept rather than exact words.
## Embedding Model

NeuralRepo uses the `@cf/baai/bge-m3` model running on Cloudflare Workers AI:
| Property | Value |
|---|---|
| Model | `@cf/baai/bge-m3` |
| Dimensions | 1024 |
| Max input tokens | 8192 |
| Multilingual | Yes (100+ languages) |
| Runtime | Cloudflare Workers AI (serverless GPUs on Cloudflare's network) |
The BGE-M3 model was chosen for its strong multilingual performance, high dimension count for nuanced similarity, and native availability on Cloudflare’s infrastructure with zero cold start.
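Generating an embedding from a Worker is a single binding call. The sketch below assumes a simplified `Env` shape and response type; the real binding types come from your generated Worker types, so treat this as illustrative:

```typescript
// Simplified stand-ins for the Workers AI binding (assumption: the real
// types are generated by wrangler; only the fields used here are modeled).
interface EmbeddingResponse {
  data: number[][]; // one 1024-float vector per input string
}

interface Env {
  AI: { run(model: string, input: { text: string[] }): Promise<EmbeddingResponse> };
}

// Embed a single piece of text with BGE-M3 and return its 1024-dim vector.
async function embedText(env: Env, text: string): Promise<number[]> {
  const res = await env.AI.run("@cf/baai/bge-m3", { text: [text] });
  return res.data[0];
}
```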
## Vector Storage

Embeddings are stored in Cloudflare Vectorize with the following configuration:
| Setting | Value |
|---|---|
| Metric | Cosine similarity |
| Dimensions | 1024 |
| Metadata | user_id, status, tags, created_at |
Metadata is stored alongside each vector to enable filtered queries (e.g., searching only ideas in `building` status).
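A filtered query against the index might look like the following. The binding interface is a minimal stand-in and the `building` filter mirrors the example above; the `topK`/`filter`/`returnMetadata` options follow the Vectorize query API:

```typescript
// Minimal stand-in for the Vectorize binding (assumption: only the
// query surface used here is modeled).
interface Match { id: string; score: number; metadata?: Record<string, unknown> }
interface VectorizeIndex {
  query(
    vector: number[],
    opts: { topK: number; filter?: Record<string, unknown>; returnMetadata?: boolean },
  ): Promise<{ matches: Match[] }>;
}

// Search only this user's ideas that are in "building" status.
async function searchBuildingIdeas(
  index: VectorizeIndex,
  queryVector: number[],
  userId: string,
): Promise<Match[]> {
  const { matches } = await index.query(queryVector, {
    topK: 10,
    filter: { user_id: userId, status: "building" },
    returnMetadata: true,
  });
  return matches;
}
```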
## Embedding Pipeline

When a new idea is created, the embedding process runs asynchronously:
```text
Idea Created
  │
  ▼
Queue Message (idea_created)
  │
  ▼
Text Normalization
  │  - Combine title + body
  │  - Trim whitespace
  │  - Collapse multiple spaces/newlines
  │  - Truncate to 8,192 characters
  ▼
Workers AI Embedding
  │  - POST @cf/baai/bge-m3
  │  - Returns 1024-dimension vector
  ▼
Vectorize Upsert
  │  - ID: idea_{user_id}_{idea_id}
  │  - Vector: 1024 floats
  │  - Metadata: user_id, status, tags
  ▼
Duplicate Detection
  │  - Query topK=5 for similar vectors
  │  - Flag pairs above similarity threshold
  ▼
Done
```
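The steps above can be sketched as one consumer function. The message shape, binding interfaces, and 0.75 flag threshold mirror this page; everything else (names, exact normalization order) is illustrative rather than the actual NeuralRepo source:

```typescript
// Illustrative message and binding shapes (assumptions).
interface IdeaMsg { user_id: string; idea_id: string; title: string; body: string }
interface Ai { run(model: string, input: { text: string[] }): Promise<{ data: number[][] }> }
interface Vec {
  upsert(vectors: { id: string; values: number[]; metadata: Record<string, unknown> }[]): Promise<unknown>;
  query(v: number[], o: { topK: number; filter?: Record<string, unknown> }): Promise<{ matches: { id: string; score: number }[] }>;
}

// Process one idea_created message; returns IDs flagged as potential duplicates.
async function processIdeaCreated(ai: Ai, index: Vec, msg: IdeaMsg): Promise<string[]> {
  // 1. Normalize: combine title + body, trim, collapse whitespace, truncate.
  const text = `${msg.title}\n${msg.body}`.trim().replace(/\s+/g, " ").slice(0, 8192);
  // 2. Embed with Workers AI.
  const { data } = await ai.run("@cf/baai/bge-m3", { text: [text] });
  const vector = data[0];
  // 3. Upsert into Vectorize under the documented ID scheme.
  const id = `idea_${msg.user_id}_${msg.idea_id}`;
  await index.upsert([{ id, values: vector, metadata: { user_id: msg.user_id } }]);
  // 4. Duplicate check: top 5 neighbors within this user's ideas,
  //    excluding the idea itself, above the 0.75 flag threshold.
  const { matches } = await index.query(vector, { topK: 5, filter: { user_id: msg.user_id } });
  return matches.filter((m) => m.id !== id && m.score >= 0.75).map((m) => m.id);
}
```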
## Search Pipeline

When you search for ideas, the query goes through a hybrid pipeline:
```text
Search Query
      │
      ├──────────────────────────────┐
      ▼                              ▼
FTS5 Keyword Search        Vectorize Semantic Search
      │                      - Embed query text
      │                      - topK nearest neighbors
      │                      - Filter by user_id
      │                      - Optional: filter by status, tag
      ▼                              ▼
      └──────────────┬───────────────┘
                     │
                     ▼
              Rerank & Merge
                - Combine FTS and vector results
                - Deduplicate by idea ID
                - Score: weighted blend of BM25 + cosine
                     │
                     ▼
              Return Results
                - Ordered by combined score
                - Includes similarity score per result
```

The hybrid approach ensures that exact keyword matches rank highly while semantically similar ideas (different words, same meaning) are also surfaced.
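The rerank-and-merge step could look like the pure function below. The 0.6/0.4 weights are illustrative (the doc only says "weighted blend"), and BM25 scores are assumed to be pre-normalized into [0, 1] so the two signals are comparable:

```typescript
// Merge keyword and semantic hits, deduplicate by idea ID, and blend scores.
// Assumption: bm25 is already normalized to [0, 1]; raw BM25 is unbounded.
function mergeResults(
  ftsHits: { ideaId: string; bm25: number }[],
  vecHits: { ideaId: string; cosine: number }[],
  wKeyword = 0.6,
  wSemantic = 0.4,
) {
  const byId = new Map<string, { ideaId: string; bm25: number; cosine: number }>();
  // Seed with keyword hits, then fold in semantic hits on the same idea ID.
  for (const h of ftsHits) byId.set(h.ideaId, { ideaId: h.ideaId, bm25: h.bm25, cosine: 0 });
  for (const h of vecHits) {
    const cur = byId.get(h.ideaId) ?? { ideaId: h.ideaId, bm25: 0, cosine: 0 };
    cur.cosine = h.cosine;
    byId.set(h.ideaId, cur);
  }
  // Weighted blend, ordered best-first.
  return [...byId.values()]
    .map((h) => ({ ...h, score: wKeyword * h.bm25 + wSemantic * h.cosine }))
    .sort((a, b) => b.score - a.score);
}
```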
## Duplicate Detection

When a new idea is embedded, NeuralRepo automatically checks for duplicates:
| Threshold | Action |
|---|---|
| ≥ 0.75 | Flagged as a potential duplicate; stored in `duplicate_detections` with `pending` status |
| 0.50–0.74 | Auto-linked as related; a relation is created automatically |
| < 0.50 | No action |
Duplicate detections are stored in the `duplicate_detections` table with the similarity score. Users can confirm (merge or link), dismiss, or ignore the detection.
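The threshold table translates directly into a small pure function. The action names here are illustrative labels; the storage and relation-creation logic is elided:

```typescript
// Map a cosine similarity score to the documented duplicate-detection action.
type DuplicateAction = "flag_duplicate" | "auto_link_related" | "none";

function classifySimilarity(score: number): DuplicateAction {
  if (score >= 0.75) return "flag_duplicate";    // pending row in duplicate_detections
  if (score >= 0.5) return "auto_link_related";  // relation created automatically
  return "none";                                 // below 0.50: no action
}
```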
## Text Normalization

Before embedding, text is normalized to improve consistency:
- **Combine fields.** Title and body are concatenated with a newline separator: `{title}\n{body}`.
- **Trim.** Leading and trailing whitespace is removed.
- **Collapse whitespace.** Multiple consecutive spaces or newlines are collapsed to a single space.
- **Truncate.** The combined text is truncated to 8,192 characters to fit within the model's context window.
Ideas with only a title (no body) are still embedded. The title alone provides enough signal for meaningful similarity matching.
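The normalization steps fit in one function. One assumption worth noting: this sketch collapses *any* whitespace run (including a single newline) to one space, while the doc only specifies collapsing multiple spaces/newlines:

```typescript
// Normalize idea text before embedding: combine, trim, collapse, truncate.
function normalizeForEmbedding(title: string, body?: string): string {
  const combined = body ? `${title}\n${body}` : title; // title-only ideas still embed
  return combined
    .trim()                 // strip leading/trailing whitespace
    .replace(/\s+/g, " ")   // collapse whitespace runs to a single space (assumption)
    .slice(0, 8192);        // stay within the model's context window
}
```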
## Re-embedding

When an idea's title or body is updated, a new embedding is generated:
- An `idea_updated` queue message is dispatched.
- The consumer re-normalizes and re-embeds the updated text.
- The existing Vectorize record is upserted (replaced) with the new vector.
- Duplicate detection runs again against the new embedding.
Metadata-only changes (status, tags) dispatch an `idea_metadata_updated` message that updates the Vectorize metadata without re-embedding.
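One way the metadata-only path could work, sketched under the assumption that there is no metadata-only patch operation: read the stored vector back and upsert it unchanged with fresh metadata. The `getByIds`/`upsert` calls follow the Vectorize API; the function and binding shapes are illustrative:

```typescript
// Minimal stand-in for the Vectorize operations used here (assumption).
interface StoredVector { id: string; values: number[]; metadata?: Record<string, unknown> }
interface VecIndex {
  getByIds(ids: string[]): Promise<StoredVector[]>;
  upsert(vectors: StoredVector[]): Promise<unknown>;
}

// Replace a vector's metadata while leaving its embedding untouched.
async function updateIdeaMetadata(
  index: VecIndex,
  vectorId: string,
  metadata: Record<string, unknown>,
): Promise<boolean> {
  const [existing] = await index.getByIds([vectorId]);
  if (!existing) return false; // not indexed yet; a later backfill would catch it
  await index.upsert([{ id: vectorId, values: existing.values, metadata }]);
  return true;
}
```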
## Backfill

The `backfill_vectors` queue message triggers a full re-indexing of all ideas for a user. This is used for:
- Account recovery after data issues
- Model upgrades (if the embedding model changes)
- Manual re-indexing triggered by support
```sh
# Admin-only: trigger backfill for a user
curl -X POST https://neuralrepo.com/__admin/backfill-vectors \
  -H "X-Admin-Key: $ADMIN_KEY" \
  -H "Content-Type: application/json" \
  -d '{"user_id": "user_abc123"}'
```
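On the Worker side, a handler for this endpoint plausibly just validates the admin key and enqueues a `backfill_vectors` message. The queue interface and message shape below are assumptions, not the actual NeuralRepo source:

```typescript
// Minimal stand-in for a Queues producer binding (assumption).
interface Queue { send(body: unknown): Promise<void> }

// Validate the admin key, then enqueue a backfill_vectors message for the user.
async function handleBackfillRequest(req: Request, queue: Queue, adminKey: string): Promise<Response> {
  if (req.headers.get("X-Admin-Key") !== adminKey) {
    return new Response("forbidden", { status: 403 });
  }
  const { user_id } = (await req.json()) as { user_id: string };
  await queue.send({ type: "backfill_vectors", user_id });
  return new Response(JSON.stringify({ queued: true }), {
    headers: { "Content-Type": "application/json" },
  });
}
```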