
Queue Processing

NeuralRepo processes ideas asynchronously using Cloudflare Queues. When you create or update an idea, the API returns immediately while background processing handles embedding, duplicate detection, auto-tagging, and vector indexing.

API Request (create/update idea)
        │
        ▼
Response returned immediately (processing: true)
        │
        ▼
Queue Producer
        │  publishes message to Cloudflare Queue
        ▼
Queue Consumer (Workers handler)
        │  receives batch of messages
        │  routes by message type
        ├── idea_created
        ├── idea_updated
        ├── idea_metadata_updated
        └── backfill_vectors

The queue consumer runs as a Cloudflare Workers handler bound to the queue. Messages are processed in batches for efficiency.
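
Routing by message type can be separated from the Workers `queue()` handler itself, which keeps it unit-testable. A sketch, with illustrative pipeline names and a minimal message shape (the real handler would loop over the batch and ack or retry each message):

```typescript
// Minimal shape shared by the queue payloads described below (illustrative).
interface QueueMessage {
  type: string;
  idea_id?: number;
  user_id: string;
  metadata?: Record<string, unknown>;
}

// Pure router: maps a message to the pipeline that should handle it.
// Invalid message types are not retried; the caller logs and discards them.
function routeMessage(msg: QueueMessage): string {
  switch (msg.type) {
    case "idea_created":          return "createdPipeline";
    case "idea_updated":          return "updatedPipeline";
    case "idea_metadata_updated": return "metadataUpdate";
    case "backfill_vectors":      return "backfill";
    default:
      throw new Error(`unknown message type: ${msg.type}`);
  }
}
```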

idea_created

Triggered when a new idea is saved. This is the most comprehensive pipeline.

Payload:

{
  "type": "idea_created",
  "idea_id": 42,
  "user_id": "user_abc123"
}

Processing steps:

  1. Fetch idea from D1 (title + body).
  2. Normalize text — combine title and body, trim, collapse whitespace, truncate to 8,192 characters.
  3. Generate embedding via Workers AI (@cf/baai/bge-m3).
  4. Upsert to Vectorize with metadata (user_id, status, tags, created_at).
  5. Duplicate detection — query Vectorize for top 5 nearest neighbors. Flag pairs above 0.75 similarity threshold.
  6. Auto-tag (if BYOK key available) — use AI to suggest tags based on content. Store suggestions for user review.
  7. Update idea — set processing: false in D1.

idea_updated

Triggered when an idea’s title or body changes.

Payload:

{
  "type": "idea_updated",
  "idea_id": 42,
  "user_id": "user_abc123"
}

Processing steps:

  1. Fetch updated idea from D1.
  2. Normalize text and re-generate embedding.
  3. Upsert to Vectorize — replaces the existing vector with the new one.
  4. Re-run duplicate detection against the updated embedding.
  5. Update idea — set processing: false.
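
Step 3 relies on Vectorize upsert semantics: writing a vector with an existing ID replaces the stored one, so no explicit delete is needed. A sketch of building that payload; the `idea-<id>` ID scheme and the row shape are assumptions for illustration, while the metadata fields are the ones listed in the idea_created steps:

```typescript
interface IdeaRow {
  id: number;
  user_id: string;
  status: string;
  tags: string[];
  created_at: string;
}

// Build the Vectorize upsert payload. Reusing the same deterministic ID
// (assumed here to be `idea-<id>`) means an upsert after an edit
// overwrites the stale embedding instead of adding a second vector.
function buildVector(idea: IdeaRow, embedding: number[]) {
  return {
    id: `idea-${idea.id}`,
    values: embedding,
    metadata: {
      user_id: idea.user_id,
      status: idea.status,
      tags: idea.tags,
      created_at: idea.created_at,
    },
  };
}
```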

idea_metadata_updated

Triggered when only metadata changes (status, tags) — no content change.

Payload:

{
  "type": "idea_metadata_updated",
  "idea_id": 42,
  "user_id": "user_abc123",
  "metadata": {
    "status": "building",
    "tags": ["cli", "devtools"]
  }
}

Processing steps:

  1. Update Vectorize metadata only. The vector itself is not regenerated since the content has not changed.
  2. This ensures filtered searches (e.g., “search within building status”) reflect the latest metadata.
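
Since a Vectorize upsert writes a whole record, one way to implement a metadata-only update is to read the stored vector, merge the changed fields, and write it back with the embedding values untouched. The read-then-upsert approach is an assumption; the merge step is the pure part and is sketched here:

```typescript
type VectorMetadata = Record<string, unknown>;

// Overlay only the changed fields (status, tags, ...) onto the stored
// metadata; every other field, and the embedding itself, stays as-is.
function mergeMetadata(stored: VectorMetadata, changes: VectorMetadata): VectorMetadata {
  return { ...stored, ...changes };
}
```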

backfill_vectors

Triggered manually for full re-indexing of a user’s ideas.

Payload:

{
  "type": "backfill_vectors",
  "user_id": "user_abc123"
}

Processing steps:

  1. Fetch all active ideas for the user from D1.
  2. Batch embed — process ideas in batches of 10 to avoid rate limits.
  3. Batch upsert to Vectorize.
  4. Run duplicate detection across all pairs above the threshold.
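
Step 2’s batching can be sketched as a generic chunking helper. The batch size of 10 comes from the step above; the helper itself is illustrative:

```typescript
const EMBED_BATCH_SIZE = 10;

// Split the user's ideas into fixed-size batches so each Workers AI
// embedding call stays within rate limits.
function chunk<T>(items: T[], size: number = EMBED_BATCH_SIZE): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```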

Error Handling

The queue consumer handles errors with a retry strategy:

Scenario                   Behavior
Workers AI unavailable     Retry with exponential backoff (3 attempts)
Vectorize write failure    Retry with exponential backoff (3 attempts)
D1 read failure            Retry once, then dead-letter
Invalid message format     Log error, discard message
All retries exhausted      Message moves to dead-letter queue for manual review

Failed messages are logged with the idea ID, user ID, error message, and attempt count for debugging.
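
The exponential-backoff rows in the table can be sketched as a generic retry wrapper. The attempt count matches the table; the delay values and the function name are illustrative:

```typescript
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Retry a flaky operation with exponential backoff: the delay doubles on
// each attempt. After the final attempt the error propagates, letting the
// queue runtime move the message toward the dead-letter queue.
async function withRetry<T>(
  op: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 100,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await op();
    } catch (err) {
      lastError = err;
      if (attempt < attempts - 1) await sleep(baseDelayMs * 2 ** attempt);
    }
  }
  throw lastError;
}
```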

Cron Triggers

NeuralRepo uses three Cloudflare Workers Cron Triggers for scheduled tasks:

Weekly Digest

Property    Value
Schedule    0 18 * * SUN (Sunday 6:00 PM UTC)
Purpose     Generate and send weekly digest emails

The digest cron:

  1. Queries all users with digest_enabled = true.
  2. For each user, gathers the week’s new ideas, status changes, and stats.
  3. If the user has a BYOK key, calls the AI provider for a narrative summary.
  4. Formats and sends the digest email.

Stale Idea Check

Property    Value
Schedule    0 9 * * 1 (Monday 9:00 AM UTC)
Purpose     Flag ideas that have not been updated recently

The stale check cron:

  1. Queries ideas in exploring or building status that have not been updated in 14+ days.
  2. Updates internal metadata to flag them as stale.
  3. The flags are included in the next weekly digest.
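
The staleness rule in step 1 can be written as a pure predicate. The 14-day cutoff and the two active statuses come from the step above; the names are illustrative:

```typescript
const STALE_AFTER_DAYS = 14;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// An idea is stale if it is still in an active status and has not been
// updated for 14 or more days.
function isStale(status: string, updatedAt: Date, now: Date): boolean {
  const activeStatus = status === "exploring" || status === "building";
  const ageDays = (now.getTime() - updatedAt.getTime()) / MS_PER_DAY;
  return activeStatus && ageDays >= STALE_AFTER_DAYS;
}
```
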

GitHub Sync

Property    Value
Schedule    0 3 * * * (daily at 3:00 AM UTC)
Purpose     Sync ideas with linked GitHub repositories

The GitHub sync cron syncs idea metadata with connected GitHub repositories for users who have configured GitHub integration.
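
Since all three schedules are bound to the same Worker, the scheduled handler tells them apart by the cron expression carried on the event. A sketch of that dispatch; the expressions are from the tables above, while the task names are illustrative:

```typescript
// Map each configured cron expression to its task name.
const CRON_TASKS: Record<string, string> = {
  "0 18 * * SUN": "sendWeeklyDigests",
  "0 9 * * 1": "flagStaleIdeas",
  "0 3 * * *": "syncGitHubRepos",
};

// In the real Worker this would run inside the exported scheduled()
// handler, which reads event.cron to decide which job to run.
function taskForCron(cron: string): string {
  const task = CRON_TASKS[cron];
  if (task === undefined) throw new Error(`no task registered for cron: ${cron}`);
  return task;
}
```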

Monitoring

Queue health can be monitored through the Cloudflare dashboard:

  • Messages in queue — should be near zero during normal operation.
  • Messages processed per minute — spikes after bulk imports.
  • Dead-letter count — should be zero. Any messages here indicate a processing bug.
  • Consumer latency — time from message publish to processing completion.