Sleep Agent
Comprehensive guide to the Sleep Agent's background maintenance pipeline that keeps your memory system optimized, connected, and organized.
Table of Contents
- What is the Sleep Agent?
- Why Automatic Maintenance?
- Execution Schedule
- The 10-Step Pipeline
- Step Details
- Viewing Sleep History
- Understanding Sleep Metrics
- Manual Trigger
- Troubleshooting Sleep Failures
- Performance Impact
What is the Sleep Agent?
The Sleep Agent is an automated background maintenance system that runs periodically to:
- Organize: Generate semantic tags for untagged memories
- Connect: Create relationships between related memories
- Optimize: Move memories to appropriate storage tiers (hot/warm/archive)
- Summarize: Regenerate summaries for tiered memories
- Maintain: Update priority scores and decay metrics
Think of it as a "janitorial service" for your knowledge base—quietly running in the background to keep everything organized and accessible.
Why Automatic Maintenance?
Manual memory management doesn't scale. As your knowledge base grows to thousands of memories, the Sleep Agent provides:
1. Automatic Organization
- Generates semantic tags from content (e.g., "authentication", "database", "security")
- No manual tagging required—tags emerge organically from usage
2. Intelligent Relationships
- Discovers connections between memories based on shared tags, entities, and topics
- Builds a knowledge graph automatically
- Surfaces related memories during retrieval
3. Cost Optimization
- Moves rarely-accessed memories to cheaper archive tier
- Keeps frequently-used memories in fast hot tier
- Reduces storage costs by 60-80% for large knowledge bases
4. Search Quality
- Updates priority scores based on access patterns
- Decay scores reflect staleness—old memories rank lower
- Relationship scores boost connected memories in search results
5. Zero User Effort
- Runs automatically every 60 minutes
- No configuration required
- Gracefully handles failures and retries
Execution Schedule
Default Schedule
- Frequency: Every 60 minutes
- Trigger: Durable Object scheduler (Cloudflare)
- Execution: Render queue worker processes sleep job
Timing
00:00 - Sleep cycle 1 starts
00:03 - Sleep cycle 1 completes (3min 45sec)
01:00 - Sleep cycle 2 starts
01:02 - Sleep cycle 2 completes (2min 10sec)
02:00 - Sleep cycle 3 starts
...
Concurrency
- One sleep job per organization at a time
- If the previous cycle is still running, the new cycle is skipped
- Local lock prevents duplicate executions
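The skip behavior can be sketched as a small in-process guard. A minimal TypeScript sketch, assuming hypothetical `tryStartSleepCycle` / `finishSleepCycle` helpers (names are illustrative, not the actual worker code):

```typescript
// Hypothetical sketch of the per-organization concurrency guard.
// One sleep job per org: if a cycle is already running, a new
// trigger is skipped rather than queued behind it.
const activeRuns = new Set<string>();

function tryStartSleepCycle(orgId: string): "started" | "skipped" {
  if (activeRuns.has(orgId)) {
    return "skipped"; // previous cycle still running
  }
  activeRuns.add(orgId);
  return "started";
}

function finishSleepCycle(orgId: string): void {
  activeRuns.delete(orgId);
}
```

In a multi-instance deployment the real lock would need to be shared (e.g., in Redis), but the skip-not-queue semantics are the same.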
Configuration
Schedule can be adjusted via Durable Object:
// In /apps/durable/src/index.ts
const SLEEP_INTERVAL_MS = 60 * 60 * 1000; // 60 minutes
The 10-Step Pipeline
Each sleep cycle processes memories through ten sequential steps. Steps are checkpointed: if a step fails, the next cycle resumes from the last successful step.
┌────────────────────────────────────────────────────────────────────────┐
│ Sleep Cycle │
│ │
│ ┌─────────┐ ┌─────┐ ┌───────┐ ┌────────────┐ ┌───────┐ │
│ │ Collect │→│ Tag │→│Extract│→│ Detect │→│ Decay │ │
│ │ (100) │ │(20) │ │ Facts │ │Contradict. │ │Confid.│ │
│ └─────────┘ └─────┘ └───────┘ └────────────┘ └───────┘ │
│ │ │ │
│ │ ┌─────────┐ ┌────────┐ ┌──────┐ ┌──────┐ ┌───────────┐ │
│ └──→ │ Build │→│ Reason │→│ Link │→│ Tier │→│ Summarize │ │
│ │Profiles │ │(5 ent.)│ │(30) │ │(50) │ │ (8) │ │
│ └─────────┘ └────────┘ └──────┘ └──────┘ └───────────┘ │
│ │
│ Duration: ~5-10 minutes per cycle │
└────────────────────────────────────────────────────────────────────────┘
Numbers in parentheses indicate typical counts per cycle.
New Cognitive Steps
The pipeline now includes five cognitive steps beyond the original five:
| Step | Purpose |
|---|---|
| Extract Facts | gpt-4o-mini extracts atomic facts with source weighting, noise filtering, and surprisal scoring |
| Detect Contradictions | Creates first-class contradiction records; auto-resolves high-gap conflicts, keeps close calls for review (max 20/cycle) |
| Decay Confidence | Reduces confidence on unreinforced facts (2%/week), boosts corroborated facts, exempts user corrections |
| Build Profiles | Constructs entity profiles with narrative summaries (LLM-generated prose for top 3 entities) |
| Reason | Derives deductions (2+ premises, conf 0.80-0.90) and inductions (3+ data points, conf 0.60-0.75) for top 5 entities |
Step Details
Step 1: Collect Candidates (collectCandidates)
Purpose: Fetch memories needing maintenance and update their baseline scores.
Logic:
- Query Convex for up to 100 memories based on:
  - Last updated timestamp
  - Missing or stale auto-tags
  - Low relationship score
  - Outdated priority/decay scores
- Update priority and decay scores:

  ```
  ageHours = (now - memory.updatedAt) / 3600000
  decayBoost = min(0.2, ageHours / 720) // Max 0.2 over 30 days
  newDecay = min(1, memory.decayScore + decayBoost)
  newPriority = max(0, memory.priority - decayBoost / 2)
  ```

- Store top 50 candidates for processing in subsequent steps
Metrics:
- `candidates`: Total memories fetched
- `adjustmentsConsidered`: Memories needing score updates
- `adjustmentsApplied`: Memories successfully updated
Example Output:
Collected 100 candidates, applied 87 priority adjustments
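The score update in this step follows directly from the formulas above. A minimal TypeScript sketch (the `updateScores` helper and its signature are illustrative, not the actual implementation):

```typescript
// Sketch of the Step 1 priority/decay update, using the formulas
// documented above. Field names are illustrative.
interface ScoreUpdate {
  newDecay: number;
  newPriority: number;
}

function updateScores(
  nowMs: number,
  updatedAtMs: number,
  decayScore: number,
  priority: number
): ScoreUpdate {
  const ageHours = (nowMs - updatedAtMs) / 3_600_000;
  const decayBoost = Math.min(0.2, ageHours / 720); // capped at 0.2
  return {
    newDecay: Math.min(1, decayScore + decayBoost),
    newPriority: Math.max(0, priority - decayBoost / 2),
  };
}
```

For example, a memory untouched for 72 hours gains a decay boost of 0.1, so its priority drops by half that amount (0.05).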
Step 2: Tag (tag)
Purpose: Generate semantic tags for memories missing or outdated tags.
Logic:
- Filter candidates needing re-tagging:
  - `autoTags` is empty, OR
  - Last tagged more than 7 days ago
- Limit to top 20 candidates (rate limit OpenAI API)
- For each candidate:
  - Fetch full memory content from Convex
  - Generate tags via rule-based extraction + keyword heuristics:
    - Extract source-based tags (e.g., "upload", "google_drive")
    - Parse title and content for semantic keywords (min 4 chars)
    - Combine rule tags + keyword tags (max 6 tags)
- Update memory with new `autoTags`
- Merge `autoTags` + `manualTags` into unified `tags` array
- Record `lastTaggedAt` timestamp
Tag Sources:
- Auto Tags: Generated by Sleep Agent (editable, refreshed every 7 days)
- Manual Tags: User-assigned via UI (emerald badges, never overwritten)
- Combined Tags: Union of auto + manual (used for search and relationships)
Fallback Tagging: If a memory has zero tags after processing:
- Assign source-based fallback tags (e.g., "upload", "connector")
- Enqueue maintenance job for deeper analysis
Metrics:
- `tagsRefreshed`: Memories with updated auto-tags
- `tagsAdded`: Memories receiving fallback tags
- `maintenanceEnqueued`: Memories needing deeper tagging
Example Output:
Refreshed tags for 12 memories, added fallback tags to 3
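The rule-based + keyword heuristic can be approximated as follows. This is a simplified illustration (the actual extractor's stop-word list and keyword ranking are not documented here; `generateTags` is a hypothetical name):

```typescript
// Illustrative sketch of source-based + keyword tagging:
// one source tag, then up to five 4+ character keywords, max 6 tags.
const STOP_WORDS = new Set(["this", "that", "with", "from", "have", "about"]);

function generateTags(source: string, title: string, content: string): string[] {
  const tags = new Set<string>();
  tags.add(source.toLowerCase()); // source-based tag, e.g. "upload"
  // Semantic keywords: lowercase words of at least 4 characters.
  const words = `${title} ${content}`.toLowerCase().match(/[a-z]{4,}/g) ?? [];
  for (const word of words) {
    if (tags.size >= 6) break; // max 6 tags
    if (!STOP_WORDS.has(word)) tags.add(word);
  }
  return [...tags];
}
```

In this sketch the source tag is always kept, matching the fallback-tagging rule that memories never end up with zero tags.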
Step 3: Link (link)
Purpose: Discover and create relationships between related memories.
Logic:
- Generate relationship proposals using graph analysis:
  - Compare all candidate pairs (N × N comparisons)
  - Calculate shared tags between each pair
  - Compute Jaccard similarity (intersection / union)
  - Check if memories share same source
- Score each proposal:

  ```
  confidence = jaccardSimilarity
  if (sameSource) confidence += 0.15
  if (sharedTags.length > 0) confidence += 0.20
  if (sameTier) confidence += 0.05
  ```

- Filter proposals:
  - Minimum confidence: 0.5 (configurable)
  - Require either shared tags OR same source (prevent low-signal links)
  - Deduplicate by normalized pair (A→B same as B→A)
- Rank by confidence, take top 30 proposals
- Create edges in Convex:

  ```
  memoryEdge = {
    fromId: proposal.fromId,
    toId: proposal.toId,
    relation: 'related' | 'same_source',
    weight: min(1, max(0.35, confidence)),
    confidence: proposal.confidence,
    metadata: {
      method: 'sleep-agent',
      sharedTags: proposal.sharedTags,
      rationale: proposal.rationale
    }
  }
  ```
Relationship Types:
- `related`: Shared tags indicate semantic similarity
- `same_source`: Both from same origin (e.g., same PDF, same chat session)
Metrics:
- `linkAttempts`: Relationship proposals generated
- `edgesCreated`: Edges successfully stored in Convex
Example Output:
Generated 24 relationship proposals, created 18 edges
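The proposal scoring combines Jaccard similarity over tags with the bonuses listed above. A TypeScript sketch under those assumptions (the `scorePair` helper and `MemoryMeta` shape are illustrative):

```typescript
// Sketch of Step 3 proposal scoring: Jaccard similarity over tags,
// plus bonuses for same source, any shared tag, and same tier.
interface MemoryMeta {
  tags: string[];
  source: string;
  tier: string;
}

function scorePair(a: MemoryMeta, b: MemoryMeta): number {
  const setA = new Set(a.tags);
  const shared = b.tags.filter((t) => setA.has(t));
  const union = new Set([...a.tags, ...b.tags]).size;
  let confidence = union === 0 ? 0 : shared.length / union; // Jaccard
  if (a.source === b.source) confidence += 0.15;
  if (shared.length > 0) confidence += 0.2;
  if (a.tier === b.tier) confidence += 0.05;
  return confidence;
}
```

Two memories sharing one of three distinct tags, from the same source and tier, score 1/3 + 0.15 + 0.20 + 0.05 ≈ 0.73, comfortably above the 0.5 cutoff.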
Step 4: Tier (tier)
Purpose: Move memories to appropriate storage tiers based on usage patterns.
Tier Definitions:
| Tier | Description | Storage | Retrieval Speed |
|---|---|---|---|
| Hot | Actively used memories | Fast SSD, RAM cache | <50ms |
| Warm | Occasionally accessed | Standard SSD | 50-200ms |
| Archive | Rarely accessed | Compressed, slower disk | 200-1000ms |
Tier Qualification Logic:
A memory stays in hot if it meets ANY of:
- Priority ≥ 0.65 (high priority)
- Decay score ≤ 0.25 (low decay, recently relevant)
- Accessed within last 3 days
- Relationship score ≥ 6 (densely connected)
- Recent edge count ≥ 4 (actively linked)
- Manually pinned by user (`isPinned: true`)
Moves to warm if it meets ANY of:
- Priority ≥ 0.3 (moderate priority)
- Accessed within last 21 days
- Has manual tags (user-curated)
- Relationship score ≥ 3 (moderately connected)
Moves to archive if ALL of:
- Not accessed for 30+ days (45+ for deep cold)
- Low priority (< 0.3)
- High decay score (≥ 0.6-0.8)
- Low connectivity (≤ 2 relationships, no recent edges)
- No manual tags
Tier Change Reasons:
- `high_priority`: Priority score above hot threshold
- `low_decay`: Decay score indicates recent relevance
- `recent_access`: Accessed within 3 days
- `active_connections`: Many recent edges created
- `dense_graph`: High relationship score
- `manual_pin`: User pinned to hot tier
- `moderate_priority`: Warm tier qualified by priority
- `recently_accessed`: Accessed within 21 days
- `connected_graph`: Warm tier qualified by relationships
- `manual_tags_present`: User-tagged memories stay warm
- `stale_and_low_priority`: Archive qualified by inactivity
- `cold_and_disconnected`: Archive qualified by age + low connectivity
Processing:
- Evaluate each candidate's target tier
- Compare to current tier
- If different, call `mutations/memory:moveToTier`:
  - Update `memoryItems.tier` in Convex
  - Update `memoryChunks.layer` for all chunks
  - Record tier change reason
- Record telemetry event (promoted or demoted)
Metrics:
- `tierCandidates`: Memories evaluated for tier change
- `tierChanges`: Memories successfully moved
Example Output:
Evaluated 50 memories, applied 8 tier changes (3 promoted, 5 demoted)
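The hot/warm/archive rules above can be approximated as a fall-through check. Note that this sketch simplifies the archive criteria (it does not test the decay-score threshold separately; reaching the final branch implies staleness and low connectivity), and the field names are illustrative:

```typescript
// Approximate sketch of tier qualification: hot if ANY hot rule
// matches, else warm if ANY warm rule matches, else archive.
interface TierInput {
  priority: number;
  decayScore: number;
  daysSinceAccess: number;
  relationshipScore: number;
  recentEdges: number;
  isPinned: boolean;
  hasManualTags: boolean;
}

function qualifyTier(m: TierInput): "hot" | "warm" | "archive" {
  if (
    m.priority >= 0.65 ||
    m.decayScore <= 0.25 ||
    m.daysSinceAccess <= 3 ||
    m.relationshipScore >= 6 ||
    m.recentEdges >= 4 ||
    m.isPinned
  ) {
    return "hot";
  }
  if (
    m.priority >= 0.3 ||
    m.daysSinceAccess <= 21 ||
    m.hasManualTags ||
    m.relationshipScore >= 3
  ) {
    return "warm";
  }
  return "archive";
}
```

The ordering matters: a pinned memory stays hot no matter how stale it is, and a manually tagged memory never falls past warm.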
Step 5: Summarize (summarize)
Purpose: Regenerate summaries for tiered memories and enqueue follow-up maintenance.
Logic:
- For each tier change (from Step 4), if moved to `warm` or `archive`:
  - Fetch full memory content
  - Generate condensed summary (3-5 sentences)
  - Update `memoryItems.summary` in Convex
  - Record metadata: `summaryRefreshedAt`, `summaryRefreshedBy: 'sleep_agent'`
- Enqueue tier transition maintenance:
  - Create maintenance job in Convex: `mutations/system:enqueueMemoryMaintenance`
  - Task type: `tier_transition`
  - Priority: 0.9 for archive (urgent), 0.7 for warm, 0.55 for hot
  - Payload: `{ fromTier, toTier, reason }`
Summary Generation:
// For archive tier (aggressive compression):
summary = summarize(content, { maxSentences: 3 })
// For warm tier (balanced):
summary = summarize(content, { maxSentences: 5 })
// Context included:
summaryContext = [
`Title: ${memory.title}`,
`Source: ${memory.source}`,
`Tier change reason: ${change.reason}`,
`Tags: ${combinedTags.join(', ')}`,
memory.content
].join('\n\n')
Maintenance Jobs: Future processing triggered by tier transitions:
- Archive: Compress embeddings, remove detailed metadata
- Hot Promotion: Refresh embeddings, update indexes
- Cross-Tier Sync: Ensure consistency across Convex + Chroma
Metrics:
- `summariesRegenerated`: Summaries updated
- `tierMaintenanceEnqueued`: Follow-up jobs created
Example Output:
Regenerated 5 summaries, enqueued 8 tier maintenance jobs
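The per-tier settings in this step (maintenance priority from the list above, summary length from the snippet) can be collected into a small lookup table. An illustrative sketch with hypothetical names, not the actual queue code:

```typescript
// Per-tier maintenance settings as documented above: job priority,
// and summary length where a summary is regenerated (warm/archive only).
const TIER_MAINTENANCE: Record<
  "hot" | "warm" | "archive",
  { priority: number; maxSentences?: number }
> = {
  archive: { priority: 0.9, maxSentences: 3 }, // aggressive compression
  warm: { priority: 0.7, maxSentences: 5 },    // balanced
  hot: { priority: 0.55 },                     // promotion: no summary regeneration
};
```

A table like this keeps the tier-specific knobs in one place instead of scattering thresholds across the summarize and enqueue paths.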
Viewing Sleep History
Via Web Console
- Navigate to Topology: Visit https://kybernesis.com/arcana
- Open Sleep Tab: Click "Sleep Agent" tab in sidebar
- View Recent Runs: See list of recent sleep cycles with:
- Run ID
- Status (completed, failed, running)
- Triggered timestamp
- Duration
- Tasks completed/failed
- Detailed notes
Example Sleep Run:
Run ID: sleep_run_abc123
Status: ✅ Completed
Triggered: 2025-10-24 10:00:00 UTC
Duration: 3min 45sec
Tasks Completed: 127
Tasks Failed: 3
Notes: Adjusted 50 memories, added 8 tags, refreshed 12 tags,
created 18 relationships, applied 8 tier changes,
refreshed 5 summaries, enqueued 8 tier maintenance jobs
Via API
Fetch sleep history programmatically:
curl -X GET "https://api.kybernesis.ai/telemetry/sleep?orgId=YOUR_ORG_ID&limit=10" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "x-org-id: YOUR_ORG_ID"
Response:
{
"runs": [
{
"runId": "sleep_run_abc123",
"orgId": "00000000-0000-0000-0000-000000000000",
"status": "completed",
"triggeredAt": "2025-10-24T10:00:00Z",
"completedAt": "2025-10-24T10:03:45Z",
"durationMs": 225000,
"tasksCompleted": 127,
"tasksFailed": 3,
"notes": "Adjusted 50 memories, created 18 relationships..."
}
]
}
Understanding Sleep Metrics
Step Metrics
Each step records detailed counts:
{
"collectCandidates": {
"status": "completed",
"durationMs": 1200,
"counts": {
"candidates": 100,
"adjustmentsConsidered": 87,
"adjustmentsApplied": 87
}
},
"tag": {
"status": "completed",
"durationMs": 45000,
"counts": {
"candidates": 50,
"tagsRefreshed": 12,
"tagsAdded": 3,
"maintenanceEnqueued": 0
}
},
"link": {
"status": "completed",
"durationMs": 18000,
"counts": {
"linkAttempts": 24,
"edgesCreated": 18
}
},
"tier": {
"status": "completed",
"durationMs": 8000,
"counts": {
"tierCandidates": 50,
"tierChanges": 8
}
},
"summarize": {
"status": "completed",
"durationMs": 12000,
"counts": {
"tierChanges": 8,
"summariesRegenerated": 5,
"tierMaintenanceEnqueued": 8
}
}
}
Performance Indicators
Healthy Sleep Cycle:
- Duration: 2-5 minutes
- Tasks completed: 50-200
- Tasks failed: 0-5 (< 5% failure rate)
- All steps status: `completed`
Degraded Performance:
- Duration: >10 minutes
- Tasks failed: >20 (> 10% failure rate)
- Steps status: `failed` or `skipped`
Critical Issues:
- Status: `failed`
- Tasks completed: 0
- Error message in notes
Common Metrics
| Metric | Good | Warning | Critical |
|---|---|---|---|
| Duration | <5min | 5-10min | >10min |
| Success Rate | >95% | 90-95% | <90% |
| Candidates Processed | 50-100 | 20-50 | <20 |
| Edges Created | 10-30 | 5-10 | <5 |
| Tier Changes | 5-20 | 0-5 | 0 (stale) |
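The duration and success-rate thresholds from the table can be turned into a simple health classifier. An illustrative sketch (not the actual monitoring code; only two of the table's metrics are checked):

```typescript
// Classify a sleep cycle from the documented thresholds:
// duration >10min or success rate <90%  -> critical
// duration >5min  or success rate <95%  -> warning
// otherwise                             -> good
function cycleHealth(
  durationMs: number,
  tasksCompleted: number,
  tasksFailed: number
): "good" | "warning" | "critical" {
  const total = tasksCompleted + tasksFailed;
  const successRate = total === 0 ? 0 : tasksCompleted / total;
  if (durationMs > 10 * 60_000 || successRate < 0.9) return "critical";
  if (durationMs > 5 * 60_000 || successRate < 0.95) return "warning";
  return "good";
}
```

The example run shown earlier (3min 45sec, 127 completed, 3 failed) classifies as good under these rules.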
Manual Trigger
Manually trigger a sleep cycle outside the regular schedule.
Via API
curl -X POST https://api.kybernesis.ai/scheduler/run \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "x-org-id: YOUR_ORG_ID"
Response:
{
"status": "queued",
"runId": "sleep_run_manual_xyz",
"message": "Sleep job enqueued. Results available in 3-5 minutes."
}
Via Web Console
- Navigate to Sleep Tab: Visit topology page
- Click "Run Now": Button in sleep history section
- Confirm: Modal confirms manual trigger
- Monitor: Status updates in real-time (refresh page)
When to Trigger Manually
Use manual triggers when:
- After Bulk Upload: Just uploaded 100 PDFs, want immediate tagging
- Before Demo: Ensure knowledge graph is up-to-date
- Troubleshooting: Test if sleep agent is working correctly
- Force Re-Tag: Changed tagging logic, want to re-process all memories
- Connector Sync: Just synced Google Drive, want relationships created immediately
Note: Manual triggers respect the local lock—if a cycle is already running, the new trigger is skipped.
Troubleshooting Sleep Failures
Symptom: Sleep Cycle Failed
Check:
- View sleep history in web console
- Look for error message in notes field
- Check Render queue worker logs
Common Causes:
- OpenAI API Rate Limit: Exceeded embedding quota
- Convex Mutation Timeout: Large batch update took >10s
- Chroma Connection Error: Chroma service unreachable
- Redis Queue Full: Too many jobs in queue
Solutions:
- Rate Limit: Wait 60 seconds, cycle will retry next hour
- Timeout: Reduce `TAG_REFRESH_LIMIT` from 20 to 10
- Connection: Check Chroma service health, restart if needed
- Queue Full: Scale queue worker instances (1 → 3)
Symptom: Sleep Cycle Skipped
Check:
- Look for "skipped" status in sleep history
- Check `reason` field in telemetry
Common Reasons:
- `active_run`: Previous cycle still running (normal)
- `local_lock`: Another worker already processing (normal in multi-instance setup)
Solutions:
- Active Run: Wait for previous cycle to complete (5-10min)
- Local Lock: No action needed, concurrent protection working correctly
Symptom: No Tier Changes
Check:
- Verify memories meet tier change criteria
- Check `lastAccessedAt` timestamps
- Confirm priority/decay scores are updating
Common Causes:
- All memories accessed recently (within 3 days)
- Priority scores artificially high
- Manual pins preventing demotion
Solutions:
- Recent Access: Normal—memories stay hot when used
- High Priority: Review priority calculation logic
- Manual Pins: Unpin memories that no longer need hot tier
Symptom: Too Many Failed Tasks
Check:
- Calculate failure rate: `tasksFailed / (tasksCompleted + tasksFailed)`
- Identify which step has highest failure rate
- Review step-specific error logs
Common Causes:
- Tag Step: OpenAI API errors (network, quota, invalid response)
- Link Step: Convex mutation failures (validation errors)
- Tier Step: Race conditions (memory deleted during processing)
Solutions:
- Tag Step: Add retry logic, fallback to keyword-only tagging
- Link Step: Validate proposals before mutation, skip invalid edges
- Tier Step: Check memory existence before tier change, gracefully skip deleted
Symptom: Sleep Cycle Never Starts
Check:
- Verify Durable Object scheduler is running
- Check scheduler metrics: `GET /metrics/scheduler`
- Review Cloudflare Worker logs
Common Causes:
- Durable Object not bound in `wrangler.toml`
- Scheduler alarm not set
- Queue worker unreachable from Workers
Solutions:
- Missing Binding: Add `KYBERNESIS_SCHEDULER` binding to `wrangler.toml`
- No Alarm: Trigger scheduler manually once to initialize
- Unreachable Queue: Check `QUEUE_API_URL` environment variable
Performance Impact
User-Facing Impact
During Sleep Cycle:
- Search Latency: +5-10ms (Chroma busy with embeddings)
- Ingestion Latency: +10-20ms (queue contention)
- Memory Updates: May conflict with tier changes (rare)
After Sleep Cycle:
- Search Quality: +15-25% improvement (better tags, relationships)
- Storage Cost: -10-30% reduction (archive tier compression)
- Graph Density: +5-10% more edges (relationship discovery)
System Resource Usage
Queue Worker (during sleep):
- CPU: 40-70% per worker (LLM calls, graph analysis)
- Memory: 500MB-1GB (candidate processing, embeddings)
- Network: 50-100 req/s to OpenAI, Convex, Chroma
Convex (during sleep):
- Mutations: 100-300 per cycle
- Queries: 50-100 per cycle
- Database load: <10% of capacity
Chroma (during sleep):
- Embeddings added: 0-50 (new relationships)
- Query load: Minimal (no vector search in sleep agent)
Optimization Tips
- Reduce Frequency: Change from 60min to 120min if load is high
- Lower Limits: Reduce `TAG_REFRESH_LIMIT` from 20 to 10
- Skip Steps: Disable linking if graph is dense enough
- Batch Updates: Combine multiple Convex mutations into single transaction
- Parallel Processing: Run tag and tier steps concurrently (future enhancement)
Load Testing Results
Scenario: 10,000 memories, sleep cycle every 60 minutes
| Metric | Before Sleep | During Sleep | After Sleep |
|---|---|---|---|
| Hybrid Search (p95) | 120ms | 135ms (+12%) | 110ms (-8%) |
| Memory List (p95) | 80ms | 90ms (+12%) | 75ms (-6%) |
| Ingestion (p95) | 150ms | 170ms (+13%) | 145ms (-3%) |
| Queue Depth | 5 jobs | 8 jobs (+60%) | 3 jobs (-40%) |
Conclusion: Latency rises modestly during a sleep cycle (10-15%), but the gains in search quality and the reduced storage costs outweigh the temporary slowdown.
Next Steps
- View Sleep History: Check web console topology tab for recent runs
- Explore Memory Tiering: Read memory system documentation
- Understand Relationships: Learn about knowledge graph visualization
- Optimize Tags: Review tag management strategies
Support
- GitHub: ianborders/kybernesis-brain
- Issues: GitHub Issues
- Email: support@kybernesis.ai