
Queue Processing

NeuralRepo processes ideas asynchronously using Cloudflare Queues. When you create or update an idea, the API returns immediately while background processing handles embedding, duplicate detection, auto-tagging, and vector indexing.

API Request (create/update idea)
        │
        ▼
Response returned immediately (processing: true)
        │
        ▼
Queue Producer
        │  publishes message to Cloudflare Queue
        ▼
Queue Consumer (Workers handler)
        │  receives batch of messages
        │  routes by message type
        ├── idea_created
        ├── idea_updated
        ├── idea_metadata_updated
        └── backfill_vectors

The queue consumer runs as a Cloudflare Workers handler bound to the queue. Messages are processed in batches for efficiency.
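
Routing by message type can be separated from the Workers `queue()` handler itself, which keeps it unit-testable. A sketch, with illustrative pipeline names and a minimal message shape (the real handler would loop over the batch and ack or retry each message):

```typescript
// Minimal shape shared by the queue payloads described below (illustrative).
interface QueueMessage {
  type: string;
  idea_id?: number;
  user_id: string;
  metadata?: Record<string, unknown>;
}

// Pure router: maps a message to the pipeline that should handle it.
// Invalid message types are not retried; the caller logs and discards them.
function routeMessage(msg: QueueMessage): string {
  switch (msg.type) {
    case "idea_created":          return "createdPipeline";
    case "idea_updated":          return "updatedPipeline";
    case "idea_metadata_updated": return "metadataUpdate";
    case "backfill_vectors":      return "backfill";
    default:
      throw new Error(`unknown message type: ${msg.type}`);
  }
}
```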

idea_created

Triggered when a new idea is saved. This is the most comprehensive pipeline.

Payload:

{
  "type": "idea_created",
  "idea_id": 42,
  "user_id": "user_abc123"
}

Processing steps:

  1. Fetch idea from D1 (title + body).
  2. Normalize text — combine title and body, trim, collapse whitespace, truncate to 8,192 characters.
  3. Generate embedding via Workers AI (@cf/baai/bge-m3).
  4. Upsert to Vectorize with metadata (user_id, status, tags, created_at).
  5. Duplicate detection — query Vectorize for top 5 nearest neighbors. Flag pairs above 0.75 similarity threshold.
  6. Auto-tag (if BYOK key available) — use AI to suggest tags based on content. Store suggestions for user review.
  7. Update idea — set processing: false in D1.

idea_updated

Triggered when an idea’s title or body changes.

Payload:

{
  "type": "idea_updated",
  "idea_id": 42,
  "user_id": "user_abc123"
}

Processing steps:

  1. Fetch updated idea from D1.
  2. Normalize text and re-generate embedding.
  3. Upsert to Vectorize — replaces the existing vector with the new one.
  4. Re-run duplicate detection against the updated embedding.
  5. Update idea — set processing: false.
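
Step 3 relies on Vectorize upsert semantics: writing a vector with an existing ID replaces the stored one, so no explicit delete is needed. A sketch of building that payload; the `idea-<id>` ID scheme and the row shape are assumptions for illustration, while the metadata fields are the ones listed in the idea_created steps:

```typescript
interface IdeaRow {
  id: number;
  user_id: string;
  status: string;
  tags: string[];
  created_at: string;
}

// Build the Vectorize upsert payload. Reusing the same deterministic ID
// (assumed here to be `idea-<id>`) means an upsert after an edit
// overwrites the stale embedding instead of adding a second vector.
function buildVector(idea: IdeaRow, embedding: number[]) {
  return {
    id: `idea-${idea.id}`,
    values: embedding,
    metadata: {
      user_id: idea.user_id,
      status: idea.status,
      tags: idea.tags,
      created_at: idea.created_at,
    },
  };
}
```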

idea_metadata_updated

Triggered when only metadata changes (status, tags) — no content change.

Payload:

{
  "type": "idea_metadata_updated",
  "idea_id": 42,
  "user_id": "user_abc123",
  "metadata": {
    "status": "building",
    "tags": ["cli", "devtools"]
  }
}

Processing steps:

  1. Update Vectorize metadata only. The vector itself is not regenerated since the content has not changed.
  2. This ensures filtered searches (e.g., “search within building status”) reflect the latest metadata.
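
Since a Vectorize upsert writes a whole record, one way to implement a metadata-only update is to read the stored vector, merge the changed fields, and write it back with the embedding values untouched. The read-then-upsert approach is an assumption; the merge step is the pure part and is sketched here:

```typescript
type VectorMetadata = Record<string, unknown>;

// Overlay only the changed fields (status, tags, ...) onto the stored
// metadata; every other field, and the embedding itself, stays as-is.
function mergeMetadata(stored: VectorMetadata, changes: VectorMetadata): VectorMetadata {
  return { ...stored, ...changes };
}
```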

backfill_vectors

Triggered manually for full re-indexing of a user’s ideas.

Payload:

{
  "type": "backfill_vectors",
  "user_id": "user_abc123"
}

Processing steps:

  1. Fetch all active ideas for the user from D1.
  2. Batch embed — process ideas in batches of 10 to avoid rate limits.
  3. Batch upsert to Vectorize.
  4. Run duplicate detection across all pairs above the threshold.
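
Step 2’s batching can be sketched as a generic chunking helper. The batch size of 10 comes from the step above; the helper itself is illustrative:

```typescript
const EMBED_BATCH_SIZE = 10;

// Split the user's ideas into fixed-size batches so each Workers AI
// embedding call stays within rate limits.
function chunk<T>(items: T[], size: number = EMBED_BATCH_SIZE): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```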

Error Handling

The queue consumer handles errors with a retry strategy:

Scenario                   Behavior
Workers AI unavailable     Retry with exponential backoff (3 attempts)
Vectorize write failure    Retry with exponential backoff (3 attempts)
D1 read failure            Retry once, then dead-letter
Invalid message format     Log error, discard message
All retries exhausted      Message moves to dead-letter queue for manual review

Failed messages are logged with the idea ID, user ID, error message, and attempt count for debugging.
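
The exponential-backoff rows in the table can be sketched as a generic retry wrapper. The attempt count matches the table; the delay values and the function name are illustrative:

```typescript
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Retry a flaky operation with exponential backoff: the delay doubles on
// each attempt. After the final attempt the error propagates, letting the
// queue runtime move the message toward the dead-letter queue.
async function withRetry<T>(
  op: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 100,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await op();
    } catch (err) {
      lastError = err;
      if (attempt < attempts - 1) await sleep(baseDelayMs * 2 ** attempt);
    }
  }
  throw lastError;
}
```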

Cron Triggers

NeuralRepo uses three Cloudflare Workers Cron Triggers for scheduled tasks:

Weekly Digest

Property    Value
Schedule    0 18 * * SUN (Sunday 6:00 PM UTC)
Purpose     Generate and send weekly digest emails

The digest cron:

  1. Queries all users with digest_enabled = true.
  2. For each user, gathers the week’s new ideas, status changes, and stats.
  3. If the user has a BYOK key, calls the AI provider for a narrative summary.
  4. Formats and sends the digest email.

Stale Idea Check

Property    Value
Schedule    0 9 * * 1 (Monday 9:00 AM UTC)
Purpose     Flag ideas that have not been updated recently

The stale check cron:

  1. Queries ideas in exploring or building status that have not been updated in 14+ days.
  2. Updates internal metadata to flag them as stale.
  3. The flags are included in the next weekly digest.
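
The staleness rule in step 1 can be written as a pure predicate. The 14-day cutoff and the two active statuses come from the step above; the names are illustrative:

```typescript
const STALE_AFTER_DAYS = 14;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// An idea is stale if it is still in an active status and has not been
// updated for 14 or more days.
function isStale(status: string, updatedAt: Date, now: Date): boolean {
  const activeStatus = status === "exploring" || status === "building";
  const ageDays = (now.getTime() - updatedAt.getTime()) / MS_PER_DAY;
  return activeStatus && ageDays >= STALE_AFTER_DAYS;
}
```
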

GitHub Sync

Property    Value
Schedule    0 3 * * * (daily at 3:00 AM UTC)
Purpose     Sync ideas with linked GitHub repositories

The GitHub sync cron syncs idea metadata with connected GitHub repositories for users who have configured GitHub integration.
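
Since all three schedules are bound to the same Worker, the scheduled handler tells them apart by the cron expression carried on the event. A sketch of that dispatch; the expressions are from the tables above, while the task names are illustrative:

```typescript
// Map each configured cron expression to its task name.
const CRON_TASKS: Record<string, string> = {
  "0 18 * * SUN": "sendWeeklyDigests",
  "0 9 * * 1": "flagStaleIdeas",
  "0 3 * * *": "syncGitHubRepos",
};

// In the real Worker this would run inside the exported scheduled()
// handler, which reads event.cron to decide which job to run.
function taskForCron(cron: string): string {
  const task = CRON_TASKS[cron];
  if (task === undefined) throw new Error(`no task registered for cron: ${cron}`);
  return task;
}
```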

Monitoring

Queue health can be monitored through the Cloudflare dashboard:

  • Messages in queue — should be near zero during normal operation.
  • Messages processed per minute — spikes after bulk imports.
  • Dead-letter count — should be zero. Any messages here indicate a processing bug.
  • Consumer latency — time from message publish to processing completion.