Semantic Search

NeuralRepo goes beyond keyword matching. Every idea is converted into a vector embedding that captures its meaning, enabling search by concept rather than exact words.

NeuralRepo uses the @cf/baai/bge-m3 model running on Cloudflare Workers AI:

| Property | Value |
| --- | --- |
| Model | `@cf/baai/bge-m3` |
| Dimensions | 1024 |
| Max input tokens | 8192 |
| Multilingual | Yes (100+ languages) |
| Runtime | Cloudflare Workers AI (on-device GPU) |

The BGE-M3 model was chosen for its strong multilingual performance, high dimension count for nuanced similarity, and native availability on Cloudflare’s infrastructure with zero cold start.
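As a sketch of what the embedding call might look like from a Worker: the model name and 1024-dimension output come from the table above, but the `run(model, { text })` shape is injected here as an interface (rather than a real `env.AI` binding) so the example is self-contained, and the response shape is an assumption.

```typescript
// Assumed response shape: one vector per input string.
interface EmbeddingResponse {
  data: number[][];
}

// Minimal embedding helper. The `ai` parameter stands in for the
// Workers AI binding; its exact type is an assumption.
async function embedText(
  ai: { run(model: string, input: { text: string[] }): Promise<EmbeddingResponse> },
  text: string
): Promise<number[]> {
  const res = await ai.run("@cf/baai/bge-m3", { text: [text] });
  const vector = res.data[0];
  if (!vector || vector.length !== 1024) {
    throw new Error(`expected a 1024-dimension vector, got ${vector?.length ?? 0}`);
  }
  return vector;
}
```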

Embeddings are stored in Cloudflare Vectorize with the following configuration:

| Setting | Value |
| --- | --- |
| Metric | Cosine similarity |
| Dimensions | 1024 |
| Metadata | `user_id`, `status`, `tags`, `created_at` |

Metadata is stored alongside each vector to enable filtered queries (e.g., searching only ideas whose status is building).
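A filtered query might look like the following sketch. The `query(vector, { topK, filter, returnMetadata })` shape mirrors the Vectorize API, but the index is injected as an interface here so the example runs standalone, and the `topK` value and filter contents are illustrative.

```typescript
interface QueryMatch {
  id: string;
  score: number;
  metadata?: Record<string, unknown>;
}

// Stand-in for the Vectorize index binding; the type is an assumption.
interface VectorIndex {
  query(
    vector: number[],
    options: { topK: number; filter?: Record<string, unknown>; returnMetadata?: boolean }
  ): Promise<{ matches: QueryMatch[] }>;
}

// Search only within one user's ideas that are in "building" status.
async function searchBuildingIdeas(
  index: VectorIndex,
  queryVector: number[],
  userId: string
): Promise<QueryMatch[]> {
  const { matches } = await index.query(queryVector, {
    topK: 10,
    filter: { user_id: userId, status: "building" },
    returnMetadata: true,
  });
  return matches;
}
```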

When a new idea is created, the embedding process runs asynchronously:

```
Idea Created
      ▼
Queue Message (idea_created)
      ▼
Text Normalization
      │ - Combine title + body
      │ - Trim whitespace
      │ - Collapse multiple spaces/newlines
      │ - Truncate to 8,192 characters
      ▼
Workers AI Embedding
      │ - POST @cf/baai/bge-m3
      │ - Returns 1024-dimension vector
      ▼
Vectorize Upsert
      │ - ID: idea_{user_id}_{idea_id}
      │ - Vector: 1024 floats
      │ - Metadata: user_id, status, tags
      ▼
Duplicate Detection
      │ - Query topK=5 for similar vectors
      │ - Flag pairs above similarity threshold
      ▼
Done
```
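The stages above can be sketched as a single consumer function. The dependency names (`embed`, `upsert`, `similar`) are invented for illustration, and the 0.75 default threshold reflects the duplicate-detection table later in this page; in the real Worker these would be the AI and Vectorize bindings.

```typescript
// Dependencies are injected so the sketch runs without Workers bindings.
interface PipelineDeps {
  embed(text: string): Promise<number[]>;
  upsert(id: string, values: number[], metadata: Record<string, unknown>): Promise<void>;
  similar(values: number[], topK: number): Promise<{ id: string; score: number }[]>;
}

interface Idea {
  id: string;
  userId: string;
  title: string;
  body: string;
  status: string;
  tags: string[];
}

// Normalize, embed, upsert, then run duplicate detection, in order.
async function handleIdeaCreated(
  deps: PipelineDeps,
  idea: Idea,
  threshold = 0.75
): Promise<{ id: string; score: number }[]> {
  const text = `${idea.title}\n${idea.body}`.trim().replace(/\s+/g, " ").slice(0, 8192);
  const vectorId = `idea_${idea.userId}_${idea.id}`;
  const values = await deps.embed(text);
  await deps.upsert(vectorId, values, {
    user_id: idea.userId,
    status: idea.status,
    tags: idea.tags,
  });
  const neighbors = await deps.similar(values, 5);
  // The new idea usually comes back as its own nearest neighbor; skip it.
  return neighbors.filter((n) => n.id !== vectorId && n.score >= threshold);
}
```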

When you search for ideas, the query goes through a hybrid pipeline:

```
Search Query
      ├──────────────────────────────┐
      ▼                              ▼
FTS5 Keyword Search          Vectorize Semantic Search
      │                              │ - Embed query text
      │                              │ - topK nearest neighbors
      │                              │ - Filter by user_id
      │                              │ - Optional: filter by status, tag
      ▼                              ▼
      └──────────────┬───────────────┘
                     ▼
              Rerank & Merge
                     │ - Combine FTS and vector results
                     │ - Deduplicate by idea ID
                     │ - Score: weighted blend of BM25 + cosine
                     ▼
              Return Results
                     │ - Ordered by combined score
                     │ - Includes similarity score per result
```

The hybrid approach ensures that exact keyword matches rank highly while semantically similar ideas (different words, same meaning) are also surfaced.
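The rerank-and-merge step can be sketched as a pure function. The 0.4/0.6 weights are illustrative assumptions (the actual blend is not specified above), and BM25 scores are assumed to be pre-normalized to the 0–1 range so they are comparable with cosine similarities.

```typescript
interface Hit {
  id: string;
  score: number;
}

// Blend FTS (BM25) and vector (cosine) hits into one ranked list,
// deduplicating by idea ID. Weights are illustrative, not specified.
function mergeResults(
  ftsHits: Hit[],
  vectorHits: Hit[],
  ftsWeight = 0.4,
  vectorWeight = 0.6
): Hit[] {
  const combined = new Map<string, number>();
  for (const h of ftsHits) {
    combined.set(h.id, (combined.get(h.id) ?? 0) + ftsWeight * h.score);
  }
  for (const h of vectorHits) {
    combined.set(h.id, (combined.get(h.id) ?? 0) + vectorWeight * h.score);
  }
  return [...combined.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}
```

An idea matched by both legs accumulates both weighted scores, so exact keyword matches that are also semantically close rise to the top.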

When a new idea is embedded, NeuralRepo automatically checks for duplicates:

| Threshold | Action |
| --- | --- |
| >= 0.75 | Flagged as a potential duplicate; stored in `duplicate_detections` with `pending` status |
| 0.50 - 0.74 | Auto-linked as related; a relation is created automatically |
| < 0.50 | No action |

Duplicate detections are stored in the duplicate_detections table with the similarity score. Users can confirm (merge or link), dismiss, or ignore the detection.
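The thresholds transcribe directly into a small helper; the action names used here are invented for illustration.

```typescript
type DuplicateAction = "flag_duplicate" | "auto_link_related" | "none";

// Classify a cosine similarity score per the threshold table above.
function classifySimilarity(score: number): DuplicateAction {
  if (score >= 0.75) return "flag_duplicate";
  if (score >= 0.5) return "auto_link_related";
  return "none";
}
```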

Before embedding, text is normalized to improve consistency:

  1. Combine fields. Title and body are concatenated with a newline separator: {title}\n{body}.
  2. Trim. Leading and trailing whitespace is removed.
  3. Collapse whitespace. Multiple consecutive spaces or newlines are collapsed to a single space.
  4. Truncate. The combined text is truncated to 8,192 characters to fit within the model’s context window.

Ideas with only a title (no body) are still embedded. The title alone provides enough signal for meaningful similarity matching.
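The four steps above can be written as one pure function; this is a sketch, and the exact implementation is an assumption.

```typescript
// Normalize idea text before embedding: combine, trim, collapse, truncate.
function normalizeForEmbedding(title: string, body?: string): string {
  const combined = body ? `${title}\n${body}` : title; // 1. combine fields
  return combined
    .trim()                // 2. trim leading/trailing whitespace
    .replace(/\s+/g, " ")  // 3. collapse runs of spaces/newlines
    .slice(0, 8192);       // 4. truncate to 8,192 characters
}
```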

When an idea’s title or body is updated, a new embedding is generated:

  1. An idea_updated queue message is dispatched.
  2. The consumer re-normalizes and re-embeds the updated text.
  3. The existing Vectorize record is upserted (replaced) with the new vector.
  4. Duplicate detection runs again against the new embedding.

Metadata-only changes (status, tags) dispatch an idea_metadata_updated message that updates the Vectorize metadata without re-embedding.
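Since upserts replace whole records, a metadata-only update presumably re-upserts the stored vector with patched metadata. This sketch assumes a fetch-then-upsert approach and an injected index interface; the actual mechanism is not specified above.

```typescript
interface VectorRecord {
  id: string;
  values: number[];
  metadata: Record<string, unknown>;
}

// Stand-in for the Vectorize index binding; the type is an assumption.
interface Index {
  getByIds(ids: string[]): Promise<VectorRecord[]>;
  upsert(records: VectorRecord[]): Promise<void>;
}

// Patch metadata on an existing record without re-embedding.
async function updateIdeaMetadata(
  index: Index,
  id: string,
  patch: Record<string, unknown>
): Promise<void> {
  const [existing] = await index.getByIds([id]);
  if (!existing) return; // nothing indexed yet; the backfill path covers this
  await index.upsert([
    { ...existing, metadata: { ...existing.metadata, ...patch } },
  ]);
}
```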

The backfill_vectors queue message triggers a full re-indexing of all ideas for a user. This is used for:

  • Account recovery after data issues
  • Model upgrades (if the embedding model changes)
  • Manual re-indexing triggered by support
```shell
# Admin-only: trigger backfill for a user
curl -X POST https://neuralrepo.com/__admin/backfill-vectors \
  -H "X-Admin-Key: $ADMIN_KEY" \
  -H "Content-Type: application/json" \
  -d '{"user_id": "user_abc123"}'
```