Sleep Agent

Comprehensive guide to the Sleep Agent's background maintenance pipeline that keeps your memory system optimized, connected, and organized.

What is the Sleep Agent?

The Sleep Agent is an automated background maintenance system that runs periodically to:

  • Organize: Generate semantic tags for untagged memories
  • Connect: Create relationships between related memories
  • Optimize: Move memories to appropriate storage tiers (hot/warm/archive)
  • Summarize: Regenerate summaries for tiered memories
  • Maintain: Update priority scores and decay metrics

Think of it as a "janitorial service" for your knowledge base—quietly running in the background to keep everything organized and accessible.

Why Automatic Maintenance?

Manual memory management doesn't scale. As your knowledge base grows to thousands of memories, the Sleep Agent provides:

1. Automatic Organization

  • Generates semantic tags from content (e.g., "authentication", "database", "security")
  • No manual tagging required—tags emerge organically from usage

2. Intelligent Relationships

  • Discovers connections between memories based on shared tags, entities, and topics
  • Builds a knowledge graph automatically
  • Surfaces related memories during retrieval

3. Cost Optimization

  • Moves rarely-accessed memories to cheaper archive tier
  • Keeps frequently-used memories in fast hot tier
  • Reduces storage costs by 60-80% for large knowledge bases

4. Search Quality

  • Updates priority scores based on access patterns
  • Decay scores reflect staleness—old memories rank lower
  • Relationship scores boost connected memories in search results

5. Zero User Effort

  • Runs automatically every 60 minutes
  • No configuration required
  • Gracefully handles failures and retries

Execution Schedule

Default Schedule

  • Frequency: Every 60 minutes
  • Trigger: Durable Object scheduler (Cloudflare)
  • Execution: Render queue worker processes sleep job

Timing

terminal
00:00 - Sleep cycle 1 starts
00:03 - Sleep cycle 1 completes (3min 45sec)
01:00 - Sleep cycle 2 starts
01:02 - Sleep cycle 2 completes (2min 10sec)
02:00 - Sleep cycle 3 starts
...

Concurrency

  • One sleep job per organization at a time
  • If previous cycle still running, new cycle skips
  • Local lock prevents duplicate executions
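The skip-on-overlap behavior can be pictured with a minimal in-process lock (an illustrative sketch; the names and data structure are assumptions, not the production implementation):

```typescript
// Per-organization local lock (illustrative sketch, hypothetical names).
// A trigger only proceeds if no cycle is already running for that org.
const activeRuns = new Set<string>();

function tryStartSleepCycle(orgId: string): boolean {
  if (activeRuns.has(orgId)) {
    return false; // previous cycle still running — skip this trigger
  }
  activeRuns.add(orgId);
  return true;
}

function finishSleepCycle(orgId: string): void {
  activeRuns.delete(orgId);
}
```

A second trigger for the same organization returns `false` and is skipped; once the cycle finishes, the lock is released and the next trigger proceeds.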

Configuration

Schedule can be adjusted via Durable Object:

terminal
// In /apps/durable/src/index.ts
const SLEEP_INTERVAL_MS = 60 * 60 * 1000; // 60 minutes

The 10-Step Pipeline

Each sleep cycle processes memories through ten sequential steps. Steps are checkpointed: if a step fails, the next cycle resumes from the last successful step.

terminal
┌────────────────────────────────────────────────────────────────────────┐
│                           Sleep Cycle                                  │
│                                                                        │
│  ┌─────────┐ ┌─────┐ ┌───────┐ ┌────────────┐ ┌───────┐             │
│  │ Collect │→│ Tag │→│Extract│→│ Detect     │→│ Decay │              │
│  │  (100)  │ │(20) │ │ Facts │ │Contradict. │ │Confid.│              │
│  └─────────┘ └─────┘ └───────┘ └────────────┘ └───────┘              │
│       │                                            │                   │
│       │    ┌─────────┐ ┌────────┐ ┌──────┐ ┌──────┐ ┌───────────┐   │
│       └──→ │ Build   │→│ Reason │→│ Link │→│ Tier │→│ Summarize │   │
│            │Profiles │ │(5 ent.)│ │(30)  │ │(50)  │ │   (8)     │   │
│            └─────────┘ └────────┘ └──────┘ └──────┘ └───────────┘   │
│                                                                        │
│  Duration: ~5-10 minutes per cycle                                    │
└────────────────────────────────────────────────────────────────────────┘

Numbers in parentheses indicate typical counts per cycle.
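The checkpoint-and-resume behavior can be sketched as follows (an illustrative sketch; the type and function names are assumptions, not the production implementation):

```typescript
// Checkpointed step execution (illustrative sketch, hypothetical names).
// If a step throws, the cycle returns the last successful step name;
// the next cycle resumes from the step after that checkpoint.
type Step = { name: string; run: () => void };

function runCycle(steps: Step[], lastCompleted: string | null): string | null {
  // Resume after the checkpoint, or start from the beginning.
  const startIndex = lastCompleted
    ? steps.findIndex((s) => s.name === lastCompleted) + 1
    : 0;
  for (let i = startIndex; i < steps.length; i++) {
    try {
      steps[i].run();
      lastCompleted = steps[i].name;
    } catch {
      return lastCompleted; // checkpoint: next cycle resumes from here
    }
  }
  return null; // full cycle completed — next cycle starts fresh
}
```

With this shape, a step that fails transiently is retried on the next hourly cycle without re-running the steps that already succeeded.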

New Cognitive Steps

The pipeline now includes five cognitive steps beyond the original five:

| Step | Purpose |
|------|---------|
| Extract Facts | gpt-4o-mini extracts atomic facts with source weighting, noise filtering, and surprisal scoring |
| Detect Contradictions | Creates first-class contradiction records; auto-resolves high-gap conflicts, keeps close calls for review (max 20/cycle) |
| Decay Confidence | Reduces confidence on unreinforced facts (2%/week), boosts corroborated facts, exempts user corrections |
| Build Profiles | Constructs entity profiles with narrative summaries (LLM-generated prose for top 3 entities) |
| Reason | Derives deductions (2+ premises, conf 0.80-0.90) and inductions (3+ data points, conf 0.60-0.75) for top 5 entities |
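As an illustration of the Decay Confidence rule, a minimal sketch (the 2%/week rate comes from the table above; the 0.05 corroboration boost and all names are assumptions):

```typescript
// Confidence decay (illustrative sketch; only the 2%/week rate is documented,
// the corroboration boost value and field names are assumptions).
function decayConfidence(fact: {
  confidence: number;
  weeksSinceReinforced: number;
  corroborated: boolean;
  userCorrection: boolean;
}): number {
  if (fact.userCorrection) return fact.confidence; // user corrections are exempt
  if (fact.corroborated) return Math.min(1, fact.confidence + 0.05); // assumed boost
  return Math.max(0, fact.confidence - 0.02 * fact.weeksSinceReinforced);
}
```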

Step Details

Step 1: Collect Candidates (collectCandidates)

Purpose: Fetch memories needing maintenance and update their baseline scores.

Logic:

  1. Query Convex for up to 100 memories based on:

    • Last updated timestamp
    • Missing or stale auto-tags
    • Low relationship score
    • Outdated priority/decay scores
  2. Update priority and decay scores:

    terminal
    ageHours = (now - memory.updatedAt) / 3600000
    decayBoost = min(0.2, ageHours / 720) // Max 0.2 over 30 days
    newDecay = min(1, memory.decayScore + decayBoost)
    newPriority = max(0, memory.priority - decayBoost / 2)
    
  3. Store top 50 candidates for processing in subsequent steps

Metrics:

  • candidates: Total memories fetched
  • adjustmentsConsidered: Memories needing score updates
  • adjustmentsApplied: Memories successfully updated

Example Output:

terminal
Collected 100 candidates, applied 87 priority adjustments
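The score adjustment in this step can be written as a small helper (a sketch using the formulas above; the function name is hypothetical):

```typescript
// Step 1 score adjustment (sketch of the documented formulas; name is hypothetical).
function adjustScores(
  memory: { updatedAt: number; decayScore: number; priority: number },
  now: number
): { decayScore: number; priority: number } {
  const ageHours = (now - memory.updatedAt) / 3_600_000;
  const decayBoost = Math.min(0.2, ageHours / 720); // boost capped at 0.2
  return {
    decayScore: Math.min(1, memory.decayScore + decayBoost),
    priority: Math.max(0, memory.priority - decayBoost / 2),
  };
}
```

For example, a memory untouched for 72 hours gets a decay boost of 0.1 and a priority penalty of 0.05.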

Step 2: Tag (tag)

Purpose: Generate semantic tags for memories missing or outdated tags.

Logic:

  1. Filter candidates needing re-tagging:

    • autoTags is empty, OR
    • Last tagged more than 7 days ago
  2. Limit to top 20 candidates (rate limit OpenAI API)

  3. For each candidate:

    • Fetch full memory content from Convex
    • Generate tags via rule-based extraction + keyword heuristics:
      • Extract source-based tags (e.g., "upload", "google_drive")
      • Parse title and content for semantic keywords (min 4 chars)
      • Combine rule tags + keyword tags (max 6 tags)
  4. Update memory with new autoTags

  5. Merge autoTags + manualTags into unified tags array

  6. Record lastTaggedAt timestamp

Tag Sources:

  • Auto Tags: Generated by Sleep Agent (editable, refreshed every 7 days)
  • Manual Tags: User-assigned via UI (emerald badges, never overwritten)
  • Combined Tags: Union of auto + manual (used for search and relationships)

Fallback Tagging: If a memory has zero tags after processing:

  • Assign source-based fallback tags (e.g., "upload", "connector")
  • Enqueue maintenance job for deeper analysis

Metrics:

  • tagsRefreshed: Memories with updated auto-tags
  • tagsAdded: Memories receiving fallback tags
  • maintenanceEnqueued: Memories needing deeper tagging

Example Output:

terminal
Refreshed tags for 12 memories, added fallback tags to 3
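A minimal sketch of the rule-based + keyword tagging described above (the names and stopword list are illustrative, not the production code):

```typescript
// Rule-based + keyword tagging (illustrative sketch; stopword list is assumed).
const STOPWORDS = new Set(["this", "that", "with", "from", "have", "been"]);

function generateTags(memory: { source: string; title: string; content: string }): string[] {
  const ruleTags = [memory.source.toLowerCase()]; // source-based tags
  const keywordTags = (memory.title + " " + memory.content)
    .toLowerCase()
    .split(/[^a-z0-9_]+/)
    .filter((w) => w.length >= 4 && !STOPWORDS.has(w)); // min 4 chars
  const unique = [...new Set([...ruleTags, ...keywordTags])];
  return unique.slice(0, 6); // max 6 tags
}
```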

Step 8: Link (link)

Purpose: Discover and create relationships between related memories.

Logic:

  1. Generate relationship proposals using graph analysis:

    • Compare all candidate pairs (N × N comparisons)
    • Calculate shared tags between each pair
    • Compute Jaccard similarity (intersection / union)
    • Check if memories share same source
  2. Score each proposal:

    terminal
    confidence = jaccardSimilarity
    if (sameSource) confidence += 0.15
    if (sharedTags.length > 0) confidence += 0.20
    if (sameTier) confidence += 0.05
    
  3. Filter proposals:

    • Minimum confidence: 0.5 (configurable)
    • Require either shared tags OR same source (prevent low-signal links)
    • Deduplicate by normalized pair (A→B same as B→A)
  4. Rank by confidence, take top 30 proposals

  5. Create edges in Convex:

    terminal
    memoryEdge = {
      fromId: proposal.fromId,
      toId: proposal.toId,
      relation: 'related' | 'same_source',
      weight: min(1, max(0.35, confidence)),
      confidence: proposal.confidence,
      metadata: {
        method: 'sleep-agent',
        sharedTags: proposal.sharedTags,
        rationale: proposal.rationale
      }
    }
    

Relationship Types:

  • related: Shared tags indicate semantic similarity
  • same_source: Both from same origin (e.g., same PDF, same chat session)

Metrics:

  • linkAttempts: Relationship proposals generated
  • edgesCreated: Edges successfully stored in Convex

Example Output:

terminal
Generated 24 relationship proposals, created 18 edges
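The proposal scoring above can be sketched as follows (hypothetical types; the bonuses mirror the pseudocode, the production logic may differ):

```typescript
// Pair scoring for relationship proposals (illustrative sketch).
interface Candidate { id: string; tags: string[]; source: string; tier: string; }

function scorePair(a: Candidate, b: Candidate): { confidence: number; sharedTags: string[] } {
  const setA = new Set(a.tags);
  const shared = b.tags.filter((t) => setA.has(t));
  const union = new Set([...a.tags, ...b.tags]).size;
  let confidence = union === 0 ? 0 : shared.length / union; // Jaccard similarity
  if (a.source === b.source) confidence += 0.15;
  if (shared.length > 0) confidence += 0.2;
  if (a.tier === b.tier) confidence += 0.05;
  return { confidence: Math.min(1, confidence), sharedTags: shared };
}
```

Two same-source, same-tier memories sharing one of three total tags score roughly 0.73, comfortably above the 0.5 minimum.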

Step 9: Tier (tier)

Purpose: Move memories to appropriate storage tiers based on usage patterns.

Tier Definitions:

| Tier | Description | Storage | Retrieval Speed |
|------|-------------|---------|-----------------|
| Hot | Actively used memories | Fast SSD, RAM cache | <50ms |
| Warm | Occasionally accessed | Standard SSD | 50-200ms |
| Archive | Rarely accessed | Compressed, slower disk | 200-1000ms |

Tier Qualification Logic:

A memory stays in hot if it meets ANY of:

  • Priority ≥ 0.65 (high priority)
  • Decay score ≤ 0.25 (low decay, recently relevant)
  • Accessed within last 3 days
  • Relationship score ≥ 6 (densely connected)
  • Recent edge count ≥ 4 (actively linked)
  • Manually pinned by user (isPinned: true)

A memory moves to warm if it meets ANY of:

  • Priority ≥ 0.3 (moderate priority)
  • Accessed within last 21 days
  • Has manual tags (user-curated)
  • Relationship score ≥ 3 (moderately connected)

A memory moves to archive only if ALL of:

  • Not accessed for 30+ days (45+ for deep cold)
  • Low priority (< 0.3)
  • High decay score (≥ 0.6-0.8)
  • Low connectivity (≤ 2 relationships, no recent edges)
  • No manual tags

Tier Change Reasons:

  • high_priority: Priority score above hot threshold
  • low_decay: Decay score indicates recent relevance
  • recent_access: Accessed within 3 days
  • active_connections: Many recent edges created
  • dense_graph: High relationship score
  • manual_pin: User pinned to hot tier
  • moderate_priority: Warm tier qualified by priority
  • recently_accessed: Accessed within 21 days
  • connected_graph: Warm tier qualified by relationships
  • manual_tags_present: User-tagged memories stay warm
  • stale_and_low_priority: Archive qualified by inactivity
  • cold_and_disconnected: Archive qualified by age + low connectivity

Processing:

  1. Evaluate each candidate's target tier
  2. Compare to current tier
  3. If different, call mutations/memory:moveToTier:
    • Update memoryItems.tier in Convex
    • Update memoryChunks.layer for all chunks
    • Record tier change reason
  4. Record telemetry event (promoted or demoted)

Metrics:

  • tierCandidates: Memories evaluated for tier change
  • tierChanges: Memories successfully moved

Example Output:

terminal
Evaluated 50 memories, applied 8 tier changes (3 promoted, 5 demoted)
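The qualification rules above can be condensed into a single function (a sketch with assumed field names; the thresholds are taken from the lists in this section):

```typescript
// Tier qualification (illustrative sketch; field names are assumptions,
// thresholds come from the documented hot/warm/archive rules).
interface MemoryStats {
  priority: number;
  decayScore: number;
  daysSinceAccess: number;
  relationshipScore: number;
  recentEdges: number;
  isPinned: boolean;
  hasManualTags: boolean;
}

function targetTier(m: MemoryStats): "hot" | "warm" | "archive" {
  const hot =
    m.priority >= 0.65 || m.decayScore <= 0.25 || m.daysSinceAccess <= 3 ||
    m.relationshipScore >= 6 || m.recentEdges >= 4 || m.isPinned;
  if (hot) return "hot";
  const warm =
    m.priority >= 0.3 || m.daysSinceAccess <= 21 ||
    m.hasManualTags || m.relationshipScore >= 3;
  if (warm) return "warm";
  return "archive"; // stale, low priority, disconnected, no manual tags
}
```

Note how the archive tier is the fall-through case: failing every hot and warm criterion is equivalent to meeting all the archive criteria.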

Step 10: Summarize (summarize)

Purpose: Regenerate summaries for tiered memories and enqueue follow-up maintenance.

Logic:

  1. For each tier change (from the Tier step above):

    • If moved to warm or archive:
      • Fetch full memory content
      • Generate condensed summary (3-5 sentences)
      • Update memoryItems.summary in Convex
      • Record metadata: summaryRefreshedAt, summaryRefreshedBy: 'sleep_agent'
  2. Enqueue tier transition maintenance:

    • Create maintenance job in Convex: mutations/system:enqueueMemoryMaintenance
    • Task type: tier_transition
    • Priority: 0.9 for archive (urgent), 0.7 for warm, 0.55 for hot
    • Payload: { fromTier, toTier, reason }
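The priority mapping above can be captured as a constant (an illustrative sketch; the name is hypothetical):

```typescript
// Maintenance-job priority per target tier (values from the list above;
// constant name is hypothetical).
const MAINTENANCE_PRIORITY: Record<"hot" | "warm" | "archive", number> = {
  archive: 0.9, // urgent: compress embeddings, strip detailed metadata
  warm: 0.7,
  hot: 0.55,    // promotion: refresh embeddings, update indexes
};
```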

Summary Generation:

terminal
// For archive tier (aggressive compression):
summary = summarize(content, { maxSentences: 3 })

// For warm tier (balanced):
summary = summarize(content, { maxSentences: 5 })

// Context included:
summaryContext = [
  `Title: ${memory.title}`,
  `Source: ${memory.source}`,
  `Tier change reason: ${change.reason}`,
  `Tags: ${combinedTags.join(', ')}`,
  memory.content
].join('\n\n')

Maintenance Jobs: Future processing triggered by tier transitions:

  • Archive: Compress embeddings, remove detailed metadata
  • Hot Promotion: Refresh embeddings, update indexes
  • Cross-Tier Sync: Ensure consistency across Convex + Chroma

Metrics:

  • summariesRegenerated: Summaries updated
  • tierMaintenanceEnqueued: Follow-up jobs created

Example Output:

terminal
Regenerated 5 summaries, enqueued 8 tier maintenance jobs

Viewing Sleep History

Via Web Console

  1. Navigate to Topology: Visit https://kybernesis.com/arcana
  2. Open Sleep Tab: Click "Sleep Agent" tab in sidebar
  3. View Recent Runs: See list of recent sleep cycles with:
    • Run ID
    • Status (completed, failed, running)
    • Triggered timestamp
    • Duration
    • Tasks completed/failed
    • Detailed notes

Example Sleep Run:

terminal
Run ID: sleep_run_abc123
Status: ✅ Completed
Triggered: 2025-10-24 10:00:00 UTC
Duration: 3min 45sec
Tasks Completed: 127
Tasks Failed: 3
Notes: Adjusted 50 memories, added 8 tags, refreshed 12 tags,
       created 18 relationships, applied 8 tier changes,
       refreshed 5 summaries, enqueued 8 tier maintenance jobs

Via API

Fetch sleep history programmatically:

terminal
curl -X GET "https://api.kybernesis.ai/telemetry/sleep?orgId=YOUR_ORG_ID&limit=10" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "x-org-id: YOUR_ORG_ID"

Response:

terminal
{
  "runs": [
    {
      "runId": "sleep_run_abc123",
      "orgId": "00000000-0000-0000-0000-000000000000",
      "status": "completed",
      "triggeredAt": "2025-10-24T10:00:00Z",
      "completedAt": "2025-10-24T10:03:45Z",
      "durationMs": 225000,
      "tasksCompleted": 127,
      "tasksFailed": 3,
      "notes": "Adjusted 50 memories, created 18 relationships..."
    }
  ]
}
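The same request can be issued from TypeScript (a sketch based on the curl example above; the helper names and the typed response shape are assumptions):

```typescript
// Fetch sleep history (illustrative sketch; the endpoint and headers follow
// the curl example above, helper names and types are assumptions).
interface SleepRun {
  runId: string;
  status: string;
  triggeredAt: string;
  durationMs: number;
  tasksCompleted: number;
  tasksFailed: number;
}

function sleepHistoryUrl(orgId: string, limit = 10): string {
  return `https://api.kybernesis.ai/telemetry/sleep?orgId=${encodeURIComponent(orgId)}&limit=${limit}`;
}

async function fetchSleepHistory(
  apiKey: string,
  orgId: string,
  limit = 10
): Promise<SleepRun[]> {
  const res = await fetch(sleepHistoryUrl(orgId, limit), {
    headers: { Authorization: `Bearer ${apiKey}`, "x-org-id": orgId },
  });
  if (!res.ok) throw new Error(`Sleep history request failed: ${res.status}`);
  const body = (await res.json()) as { runs: SleepRun[] };
  return body.runs;
}
```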

Understanding Sleep Metrics

Step Metrics

Each step records detailed counts:

terminal
{
  "collectCandidates": {
    "status": "completed",
    "durationMs": 1200,
    "counts": {
      "candidates": 100,
      "adjustmentsConsidered": 87,
      "adjustmentsApplied": 87
    }
  },
  "tag": {
    "status": "completed",
    "durationMs": 45000,
    "counts": {
      "candidates": 50,
      "tagsRefreshed": 12,
      "tagsAdded": 3,
      "maintenanceEnqueued": 0
    }
  },
  "link": {
    "status": "completed",
    "durationMs": 18000,
    "counts": {
      "linkAttempts": 24,
      "edgesCreated": 18
    }
  },
  "tier": {
    "status": "completed",
    "durationMs": 8000,
    "counts": {
      "tierCandidates": 50,
      "tierChanges": 8
    }
  },
  "summarize": {
    "status": "completed",
    "durationMs": 12000,
    "counts": {
      "tierChanges": 8,
      "summariesRegenerated": 5,
      "tierMaintenanceEnqueued": 8
    }
  }
}

Performance Indicators

Healthy Sleep Cycle:

  • Duration: 2-5 minutes
  • Tasks completed: 50-200
  • Tasks failed: 0-5 (< 5% failure rate)
  • All steps status: completed

Degraded Performance:

  • Duration: >10 minutes
  • Tasks failed: >20 (> 10% failure rate)
  • Steps status: failed or skipped

Critical Issues:

  • Status: failed
  • Tasks completed: 0
  • Error message in notes

Common Metrics

| Metric | Good | Warning | Critical |
|--------|------|---------|----------|
| Duration | <5min | 5-10min | >10min |
| Success Rate | >95% | 90-95% | <90% |
| Candidates Processed | 50-100 | 20-50 | <20 |
| Edges Created | 10-30 | 5-10 | <5 |
| Tier Changes | 5-20 | 0-5 | 0 (stale) |

Manual Trigger

Manually trigger a sleep cycle outside the regular schedule.

Via API

terminal
curl -X POST https://api.kybernesis.ai/scheduler/run \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "x-org-id: YOUR_ORG_ID"

Response:

terminal
{
  "status": "queued",
  "runId": "sleep_run_manual_xyz",
  "message": "Sleep job enqueued. Results available in 3-5 minutes."
}

Via Web Console

  1. Navigate to Sleep Tab: Visit topology page
  2. Click "Run Now": Button in sleep history section
  3. Confirm: Modal confirms manual trigger
  4. Monitor: Status updates in real-time (refresh page)

When to Trigger Manually

Use manual triggers when:

  1. After Bulk Upload: Just uploaded 100 PDFs, want immediate tagging
  2. Before Demo: Ensure knowledge graph is up-to-date
  3. Troubleshooting: Test if sleep agent is working correctly
  4. Force Re-Tag: Changed tagging logic, want to re-process all memories
  5. Connector Sync: Just synced Google Drive, want relationships created immediately

Note: Manual triggers respect the local lock—if a cycle is already running, the new trigger will skip.

Troubleshooting Sleep Failures

Symptom: Sleep Cycle Failed

Check:

  1. View sleep history in web console
  2. Look for error message in notes field
  3. Check Render queue worker logs

Common Causes:

  • OpenAI API Rate Limit: Exceeded embedding quota
  • Convex Mutation Timeout: Large batch update took >10s
  • Chroma Connection Error: Chroma service unreachable
  • Redis Queue Full: Too many jobs in queue

Solutions:

  • Rate Limit: Wait 60 seconds, cycle will retry next hour
  • Timeout: Reduce TAG_REFRESH_LIMIT from 20 to 10
  • Connection: Check Chroma service health, restart if needed
  • Queue Full: Scale queue worker instances (1 → 3)

Symptom: Sleep Cycle Skipped

Check:

  1. Look for "skipped" status in sleep history
  2. Check reason field in telemetry

Common Reasons:

  • active_run: Previous cycle still running (normal)
  • local_lock: Another worker already processing (normal in multi-instance setup)

Solutions:

  • Active Run: Wait for previous cycle to complete (5-10min)
  • Local Lock: No action needed, concurrent protection working correctly

Symptom: No Tier Changes

Check:

  1. Verify memories meet tier change criteria
  2. Check lastAccessedAt timestamps
  3. Confirm priority/decay scores are updating

Common Causes:

  • All memories accessed recently (within 3 days)
  • Priority scores artificially high
  • Manual pins preventing demotion

Solutions:

  • Recent Access: Normal—memories stay hot when used
  • High Priority: Review priority calculation logic
  • Manual Pins: Unpin memories that no longer need hot tier

Symptom: Too Many Failed Tasks

Check:

  1. Calculate failure rate: tasksFailed / (tasksCompleted + tasksFailed)
  2. Identify which step has highest failure rate
  3. Review step-specific error logs
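The rate from the first check can be computed directly (a trivial helper; the name is hypothetical):

```typescript
// Failure rate from run metrics (illustrative helper).
function failureRate(tasksCompleted: number, tasksFailed: number): number {
  const total = tasksCompleted + tasksFailed;
  return total === 0 ? 0 : tasksFailed / total;
}
```

The example run above (127 completed, 3 failed) works out to about 2.3%, well inside the healthy <5% band.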

Common Causes:

  • Tag Step: OpenAI API errors (network, quota, invalid response)
  • Link Step: Convex mutation failures (validation errors)
  • Tier Step: Race conditions (memory deleted during processing)

Solutions:

  • Tag Step: Add retry logic, fallback to keyword-only tagging
  • Link Step: Validate proposals before mutation, skip invalid edges
  • Tier Step: Check memory existence before tier change, gracefully skip deleted

Symptom: Sleep Cycle Never Starts

Check:

  1. Verify Durable Object scheduler is running
  2. Check scheduler metrics: GET /metrics/scheduler
  3. Review Cloudflare Worker logs

Common Causes:

  • Durable Object not bound in wrangler.toml
  • Scheduler alarm not set
  • Queue worker unreachable from Workers

Solutions:

  • Missing Binding: Add KYBERNESIS_SCHEDULER binding to wrangler.toml
  • No Alarm: Trigger scheduler manually once to initialize
  • Unreachable Queue: Check QUEUE_API_URL environment variable

Performance Impact

User-Facing Impact

During Sleep Cycle:

  • Search Latency: +5-10ms (Chroma busy with embeddings)
  • Ingestion Latency: +10-20ms (queue contention)
  • Memory Updates: May conflict with tier changes (rare)

After Sleep Cycle:

  • Search Quality: +15-25% improvement (better tags, relationships)
  • Storage Cost: -10-30% reduction (archive tier compression)
  • Graph Density: +5-10% more edges (relationship discovery)

System Resource Usage

Queue Worker (during sleep):

  • CPU: 40-70% per worker (LLM calls, graph analysis)
  • Memory: 500MB-1GB (candidate processing, embeddings)
  • Network: 50-100 req/s to OpenAI, Convex, Chroma

Convex (during sleep):

  • Mutations: 100-300 per cycle
  • Queries: 50-100 per cycle
  • Database load: <10% of capacity

Chroma (during sleep):

  • Embeddings added: 0-50 (new relationships)
  • Query load: Minimal (no vector search in sleep agent)

Optimization Tips

  1. Reduce Frequency: Change from 60min to 120min if load is high
  2. Lower Limits: Reduce TAG_REFRESH_LIMIT from 20 to 10
  3. Skip Steps: Disable linking if graph is dense enough
  4. Batch Updates: Combine multiple Convex mutations into single transaction
  5. Parallel Processing: Run tag and tier steps concurrently (future enhancement)

Load Testing Results

Scenario: 10,000 memories, sleep cycle every 60 minutes

| Metric | Before Sleep | During Sleep | After Sleep |
|--------|--------------|--------------|-------------|
| Hybrid Search (p95) | 120ms | 135ms (+12%) | 110ms (-8%) |
| Memory List (p95) | 80ms | 90ms (+12%) | 75ms (-6%) |
| Ingestion (p95) | 150ms | 170ms (+13%) | 145ms (-3%) |
| Queue Depth | 5 jobs | 8 jobs (+60%) | 3 jobs (-40%) |

Conclusion: Slight performance degradation during sleep cycle (10-15% latency increase), but overall improvement in search quality and reduced storage costs outweigh temporary slowdown.

