Memory System Deep Dive

This guide provides a comprehensive look at how Kybernesis stores, processes, and manages memories.

Memory Data Model

Kybernesis uses a multi-table data model to efficiently store and query memories.

Core Tables

Memory Items

The main memory record containing metadata:

terminal
{
  id: "mem_abc123",              // Unique identifier
  orgId: "org_xyz789",           // Organization (multi-tenancy)
  itemType: "document",          // Type of memory
  title: "Q4 Planning Doc",      // Display name
  description: "Strategic...",   // Optional description
  source: "upload",              // Origin (upload/chat/connector)
  sourceRef: "file-key-123",     // Reference to source
  status: "ingested",            // Processing status
  priority: 0.75,                // Importance (0-1)
  decayScore: 0.15,              // Decay metric (0-1)
  tags: ["planning", "2024"],    // Combined tags
  autoTags: ["planning", "q4"],  // AI-generated tags
  manualTags: ["2024"],          // User-added tags
  inferredEntities: ["..."],     // Extracted entities
  tier: "hot",                   // Storage tier
  lastTaggedAt: 1698765432000,   // Last tagging timestamp
  lastRelationshipAuditAt: ...,  // Last linking timestamp
  metadata: { ... },             // Custom metadata
  ingestMetadata: { ... },       // Ingestion details
  contentHash: "sha256:...",     // Deduplication hash
  ingestedAt: 1698765432000,     // Creation timestamp
  updatedAt: 1698765555000       // Last modified timestamp
}

Memory Chunks

Content segments for vector search:

terminal
{
  id: "chunk_def456",
  orgId: "org_xyz789",
  memoryId: "mem_abc123",        // Parent memory
  chunkIndex: 0,                 // Position in document
  layer: "hot",                  // Storage tier
  content: "Q4 objectives...",   // Chunk text
  summary: "Brief overview...",  // Optional summary
  vectorId: "vec_ghi789",        // Chroma vector ID
  hotKey: "hot:mem_abc123:0",    // KV cache key
  embeddingVersion: "v1",        // Model version
  metadata: { ... },             // Chunk-specific metadata
  createdAt: 1698765432000,
  updatedAt: 1698765555000
}

Memory Entities

Knowledge graph nodes:

terminal
{
  id: "ent_jkl012",
  orgId: "org_xyz789",
  name: "Alice Johnson",         // Entity name
  type: "person",                // Entity type
  salience: 0.8,                 // Prominence (0-1)
  embeddingVersion: "v1",
  metadata: { ... },
  createdAt: 1698765432000,
  updatedAt: 1698765555000
}

Memory Edges

Knowledge graph relationships:

terminal
{
  id: "edge_mno345",
  orgId: "org_xyz789",
  fromEntityId: "ent_jkl012",    // Source entity
  toEntityId: "ent_pqr678",      // Target entity
  relation: "works_with",        // Relationship type
  weight: 0.9,                   // Strength (0-1)
  contextChunkId: "chunk_...",   // Where found
  source: "auto",                // Detection method
  confidence: 0.85,              // Certainty (0-1)
  createdByJobId: "job_...",     // Sleep run ID
  lastVerifiedAt: 1698765555000, // Last confirmation
  updatedAt: 1698765555000
}

Memory Summaries

Condensed representations:

terminal
{
  id: "sum_stu901",
  orgId: "org_xyz789",
  memoryId: "mem_abc123",
  chunkId: "chunk_def456",       // Optional chunk reference
  scope: "chunk",                // "chunk" or "memory"
  summary: "Q4 focuses on...",   // Summary text
  model: "gpt-4o-mini",          // AI model used
  tokens: 150,                   // Token count
  metadata: { ... },
  updatedAt: 1698765555000
}

Relationships Between Tables

terminal
Memory Item (1)
  ├── has many → Memory Chunks (N)
  │     └── each has → Vector Embedding in Chroma
  ├── has many → Memory Summaries (N)
  ├── references → Entities via inferredEntities
  └── participates in → Edges via contextChunkId

Entity (1)
  ├── has many outgoing → Edges (N)
  └── has many incoming → Edges (N)

Chunking Strategy

Kybernesis splits large content into manageable chunks for optimal search performance.

Why Chunking?

Benefits:

  • Better vector embeddings (focused, coherent text)
  • Granular retrieval (find specific sections)
  • Efficient processing (parallel chunk encoding)
  • Memory efficiency (load only relevant chunks)

Without chunking:

terminal
Search: "What are the Q4 budget priorities?"
Returns: Entire 50-page document
Problem: You have to read everything to find the answer

With chunking:

terminal
Search: "What are the Q4 budget priorities?"
Returns: Chunk 23 - "Budget Priorities" section (1 page)
Benefit: Precise answer in context

Chunking Rules

Default Parameters

  • Chunk size: 800-1200 characters (varies by ingestion type)
    • Chat messages: 800 characters
    • Uploads (advanced): 1200 characters
    • Connector sync: 1200 characters
  • Overlap: 80-120 characters (prevents context loss)
    • Chat messages: 80 characters
    • Uploads: 120 characters
    • Connector sync: 120 characters
  • Boundary: Sentence/paragraph boundaries (Markdown-aware for uploads)
  • Index: Sequential (0, 1, 2, ...)

Splitting Logic

The chunking system (@kybernesis/pipeline):

  1. Parse content into sentences
  2. Group sentences until reaching target size
  3. Respect boundaries - never split mid-sentence
  4. Create chunks with metadata:
    • Position in document (chunkIndex)
    • Parent memory ID
    • Original content length
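The steps above can be sketched as a simple sentence-grouping chunker. This is an illustrative sketch, not the actual @kybernesis/pipeline code; the function name, regex, and overlap handling are assumptions:

```typescript
interface Chunk {
  chunkIndex: number; // position in document
  content: string;    // chunk text
}

// Illustrative sketch of sentence-aware chunking with overlap.
// Not the real @kybernesis/pipeline implementation.
function chunkText(text: string, targetSize = 1200, overlap = 120): Chunk[] {
  // 1. Parse content into sentences (naive boundary split)
  const sentences = text.match(/[^.!?]+[.!?]+\s*|[^.!?]+$/g) ?? [];
  const chunks: Chunk[] = [];
  let current = "";

  for (const sentence of sentences) {
    // 2. Group sentences until reaching target size;
    // 3. respect boundaries — never split mid-sentence
    if (current.length + sentence.length > targetSize && current.length > 0) {
      chunks.push({ chunkIndex: chunks.length, content: current.trim() });
      // Carry a tail of the previous chunk forward as overlap
      current = (overlap > 0 ? current.slice(-overlap) : "") + sentence;
    } else {
      current += sentence;
    }
  }
  if (current.trim().length > 0) {
    chunks.push({ chunkIndex: chunks.length, content: current.trim() });
  }
  return chunks;
}
```

Each chunk gets a sequential `chunkIndex`; in the real pipeline, the parent memory ID and content length would be attached as metadata.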

Example Chunking

terminal
Input: 3000-character document

Output:
Chunk 0 (1200 chars): Introduction and background...
Chunk 1 (1200 chars, starts at 1080): Methodology and approach...
Chunk 2 (840 chars, starts at 2160): Results and conclusions...

Note: 120-char overlap between chunks for continuity

Chunk Metadata

Each chunk preserves:

  • Parent memory title
  • Source information
  • Tag inheritance
  • Timestamp from parent

This ensures chunks are self-contained and searchable independently.


Vector Embeddings

Vector embeddings power semantic search by converting text into numerical representations.

What are Embeddings?

An embedding is a dense vector (array of numbers) that captures the semantic meaning of text.

Example:

terminal
Text: "machine learning models"
Embedding: [0.23, -0.15, 0.78, ..., 0.42]  (1536 dimensions)

Similar meanings → Similar vectors:

terminal
"machine learning" ~ "artificial intelligence" ~ "neural networks"
Distance between vectors: small (0.1)

"machine learning" vs "cooking recipes"
Distance between vectors: large (0.9)

Embedding Generation

Model

  • Default: OpenAI text-embedding-3-small
  • Dimensions: 1536
  • Version tracking: Embedded in chunk metadata

Process

  1. Chunk created during ingestion
  2. Content sent to OpenAI API
  3. Embedding vector returned
  4. Vector stored in Chroma with metadata:
    terminal
    {
      id: "chunk_abc123",
      embedding: [0.23, -0.15, ...],  // 1536-dimensional vector
      metadata: {
        memoryId: "mem_xyz789",
        orgId: "org_123",
        layer: "hot",
        chunkIndex: 0
      }
    }
    

Storage in Chroma

ChromaDB is a specialized vector database that:

  • Stores embeddings efficiently
  • Supports fast similarity search
  • Handles millions of vectors
  • Filters by metadata

Collection structure:

terminal
Collection: "kybernesis_memories"
  ├── Tenant: "default"
  ├── Database: "kybernesis"
  └── Vectors: ~100,000 chunks

Similarity Search

When you query:

  1. Query text → embedding vector
  2. Compare against all chunk vectors
  3. Calculate distances (cosine, Euclidean)
  4. Return top N most similar chunks

Distance → Similarity conversion:

terminal
similarity = 1 / (1 + distance)

distance = 0.0 → similarity = 1.0 (perfect match)
distance = 1.0 → similarity = 0.5 (moderate)
distance = 10.0 → similarity = 0.09 (poor)
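The distance-to-similarity conversion above is a one-liner; a sketch, assuming the distance comes back from Chroma as a non-negative number:

```typescript
// Convert a vector distance into a 0-1 similarity score,
// per the similarity = 1 / (1 + distance) rule above.
function toSimilarity(distance: number): number {
  return 1 / (1 + distance);
}
```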

Embedding Versions

Embeddings are versioned to support:

  • Model upgrades (e.g., v1 → v2)
  • Reprocessing old memories
  • A/B testing different models

Version stored in:

  • memoryChunks.embeddingVersion
  • Chroma metadata

Memory Tiering

Automatic storage tier management optimizes retrieval performance and cost.

Tier Definitions

terminal
Tier     | Speed    | Use Case                    | Retention
---------|----------|-----------------------------|---------------------------
Hot      | Fastest  | Active, frequently accessed | Indefinite
Warm     | Moderate | Occasional access           | Until archive criteria met
Archive  | Slower   | Rarely accessed             | Indefinite

Tiering Criteria

The tiering system evaluates each memory against specific thresholds:

Hot Tier Qualification (ANY condition)

terminal
// Stay hot if ANY of these are true:
priority >= 0.65                    // High importance
decayScore <= 0.25                  // Low decay
timeSinceAccess <= 3 days           // Recently used
relationshipScore >= 6              // Dense connections
recentEdges >= 4                    // Active connections
isPinned === true                   // User pinned

Reasons for hot tier:

  • high_priority - Priority score ≥ 0.65
  • low_decay - Decay score ≤ 0.25
  • recent_access - Accessed within 3 days
  • dense_graph - Relationship score ≥ 6
  • active_connections - Recent edges ≥ 4
  • manual_pin - User pinned

Warm Tier Qualification (ANY condition)

terminal
// Move to warm if ANY of these are true:
priority >= 0.3                     // Moderate importance
timeSinceAccess <= 21 days          // Recently accessed
relationshipScore >= 3              // Some connections
manualTags.length > 0               // User-tagged

Reasons for warm tier:

  • moderate_priority - Priority ≥ 0.3
  • recently_accessed - Accessed within 21 days
  • connected_graph - Relationship score ≥ 3
  • manual_tags_present - Has user tags

Archive Tier Qualification (ALL conditions)

terminal
// Move to archive if ALL of these are true:
timeSinceAccess >= 30 days          // Stale access
priority < 0.3                      // Low priority
decayScore >= 0.6                   // High decay (0.8 for hard archive)
relationshipScore <= 2              // Isolated
recentEdges === 0                   // No activity
manualTags.length === 0             // No user tags

Reasons for archive:

  • stale_and_low_priority - 30+ days + low priority
  • cold_and_disconnected - 45+ days + isolated
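Putting the thresholds above together, tier evaluation can be sketched as a pure function. The field names mirror this guide; the function itself is illustrative, not the sleep agent's actual code:

```typescript
interface TierInput {
  priority: number;          // 0-1 importance
  decayScore: number;        // 0-1 staleness
  daysSinceAccess: number;
  relationshipScore: number;
  recentEdges: number;
  manualTagCount: number;
  isPinned: boolean;
}

// Sketch of the documented tier criteria: hot and warm need ANY
// condition, archive needs ALL of them.
function evaluateTier(m: TierInput): "hot" | "warm" | "archive" {
  if (
    m.priority >= 0.65 ||
    m.decayScore <= 0.25 ||
    m.daysSinceAccess <= 3 ||
    m.relationshipScore >= 6 ||
    m.recentEdges >= 4 ||
    m.isPinned
  ) return "hot";

  if (
    m.priority >= 0.3 ||
    m.daysSinceAccess <= 21 ||
    m.relationshipScore >= 3 ||
    m.manualTagCount > 0
  ) return "warm";

  // Archive: every condition must hold; otherwise stay warm
  const archivable =
    m.daysSinceAccess >= 30 &&
    m.priority < 0.3 &&
    m.decayScore >= 0.6 &&
    m.relationshipScore <= 2 &&
    m.recentEdges === 0 &&
    m.manualTagCount === 0;
  return archivable ? "archive" : "warm";
}
```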

Tier Transitions

Memories move between tiers during sleep cycles:

terminal
Hot → Warm
  Trigger: Unused for 4+ days AND priority drops below 0.65

Warm → Archive
  Trigger: Unused for 30+ days AND meets all archive criteria

Archive → Warm
  Trigger: New access OR manual tag added

Warm → Hot
  Trigger: Frequent access OR new relationships OR priority increase

Implementation Details

Fields updated:

  • memoryItems.tier - "hot" | "warm" | "archive"
  • memoryChunks.layer - "hot" | "warm" | "archive" (synchronized)

Tier change tracking:

terminal
interface TierChange {
  memoryId: string;
  currentTier: "hot" | "warm" | "archive";
  targetTier: "hot" | "warm" | "archive";
  reason: string;  // e.g., "stale_and_low_priority"
}

Automatic processes:

  1. Sleep agent collects tier change candidates
  2. Evaluates each memory against criteria
  3. Updates tier in database
  4. Optionally refreshes summaries for tier changes

Tagging System

Automatic and manual tagging provides flexible memory organization.

Auto-Tagging Process

Rule-Based Tags

Sources:

  1. Source type - "upload", "chat", "connector"
  2. File extensions - "pdf", "md", "docx"
  3. Connector providers - "google-drive", "notion"
  4. Filename keywords - Extracted from title

Example:

terminal
Input: {
  source: "upload",
  title: "Q4_Budget_Analysis.pdf",
  metadata: { filename: "Q4_Budget_Analysis.pdf" }
}

Auto tags: ["upload", "pdf"]

Content-Based Tags

Keyword extraction:

  1. Summarize content (2 sentences)
  2. Split into tokens
  3. Filter tokens ≥ 4 characters
  4. Take top 6 keywords
  5. Normalize and deduplicate

Example:

terminal
Content: "This document outlines the quarterly budget analysis
          for engineering and product teams..."

Extracted keywords: ["document", "quarterly", "budget", "analysis",
                     "engineering", "product"]

Tag Limits

  • Maximum auto tags: 6 per memory
  • Minimum keyword length: 4 characters
  • Normalization: Lowercase, trim punctuation
  • Deduplication: Case-insensitive unique
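A minimal sketch of the extraction rules above. The tokenization is naive, and any stop-word filtering the real pipeline performs is not shown:

```typescript
// Extract up to `max` auto-tag keywords from text:
// tokens >= 4 chars, lowercased, punctuation trimmed, deduplicated.
function extractKeywords(text: string, max = 6): string[] {
  const seen = new Set<string>();
  const keywords: string[] = [];
  for (const raw of text.split(/\s+/)) {
    // Normalize: lowercase and strip surrounding punctuation
    const token = raw.toLowerCase().replace(/^[^a-z0-9]+|[^a-z0-9]+$/g, "");
    if (token.length < 4) continue;    // minimum keyword length
    if (seen.has(token)) continue;     // case-insensitive dedupe
    seen.add(token);
    keywords.push(token);
    if (keywords.length >= max) break; // cap at 6 auto tags
  }
  return keywords;
}
```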

Manual Tagging

Users add custom tags through the UI:

Characteristics:

  • Unlimited manual tags
  • Never removed by system
  • Persistent across re-tagging
  • Prevent archival (memories with manual tags stay in the warm tier or above)

UI indicators:

  • Emerald badge color
  • "Manual" label in tag list
  • Editable/removable

Combined Tags Array

The system maintains three tag arrays:

terminal
{
  autoTags: ["upload", "pdf", "budget"],      // System-generated
  manualTags: ["2025-planning", "important"], // User-added
  tags: ["upload", "pdf", "budget",           // Union of both
          "2025-planning", "important"]
}

Used for:

  • Search filtering
  • Relationship detection
  • Memory clustering
  • Topology visualization
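The union shown above can be computed with a small merge helper. The function name is illustrative; the case-insensitive dedupe mirrors the tag-limit rules earlier in this section:

```typescript
// Combined tags: union of auto and manual tags,
// deduplicated case-insensitively while preserving order.
function mergeTags(autoTags: string[], manualTags: string[]): string[] {
  const seen = new Set<string>();
  const merged: string[] = [];
  for (const tag of [...autoTags, ...manualTags]) {
    const key = tag.toLowerCase();
    if (!seen.has(key)) {
      seen.add(key);
      merged.push(tag);
    }
  }
  return merged;
}
```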

Tag Refresh Schedule

Auto tags regenerate when:

  • Memory not tagged in 7+ days
  • Sleep agent runs tagging step
  • Limited to 20 memories per run (performance)

Process:

  1. Query memories where lastTaggedAt < (now - 7 days)
  2. Limit to 20 oldest
  3. Regenerate auto tags
  4. Merge with existing manual tags
  5. Update tags array
  6. Set lastTaggedAt = now
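Steps 1-2 of the refresh process amount to a filtered, sorted, capped query. A sketch with illustrative names (the real implementation would run this against the database, not in memory):

```typescript
interface TaggableMemory {
  id: string;
  lastTaggedAt: number; // ms epoch
}

const WEEK_MS = 7 * 24 * 60 * 60 * 1000;
const BATCH_LIMIT = 20;

// Select the re-tagging batch: memories untagged for 7+ days,
// oldest first, capped at 20 per run.
function selectRetagBatch(
  memories: TaggableMemory[],
  now: number
): TaggableMemory[] {
  return memories
    .filter((m) => m.lastTaggedAt < now - WEEK_MS)
    .sort((a, b) => a.lastTaggedAt - b.lastTaggedAt) // oldest first
    .slice(0, BATCH_LIMIT);
}
```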

Relationship Detection

The linking system builds a knowledge graph by detecting relationships between memories.

Detection Strategy

Tag-Based Linking

Memories sharing tags are candidates for relationships:

terminal
Memory A: tags = ["machine-learning", "python", "tensorflow"]
Memory B: tags = ["machine-learning", "neural-networks"]

Shared tags: ["machine-learning"]
→ Relationship proposed

Entity-Based Linking

Memories mentioning the same entities:

terminal
Memory A: inferredEntities = ["Alice", "Product Team", "Q4"]
Memory B: inferredEntities = ["Alice", "Engineering", "Q4"]

Shared entities: ["Alice", "Q4"]
→ Relationship proposed

Relationship Proposal

Each proposal includes:

terminal
{
  fromMemoryId: "mem_abc123",
  toMemoryId: "mem_def456",
  relation: "related_to",        // Relationship type
  weight: 0.75,                  // Strength (0-1)
  confidence: 0.8,               // Certainty (0-1)
  evidence: {
    sharedTags: ["machine-learning"],
    sharedEntities: ["Alice"],
    contextChunk: "chunk_ghi789"
  }
}

Relationship Types

Common relation types:

  • related_to - General connection
  • references - Citation/mention
  • part_of - Hierarchical
  • similar_to - Semantic similarity
  • precedes / follows - Temporal
  • authored_by - Attribution

Edge Weight Calculation

terminal
weight = (
  sharedTagCount * 0.3 +
  sharedEntityCount * 0.4 +
  semanticSimilarity * 0.3
)

// Normalized to 0-1 range
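As a sketch, the weight formula might look like the following. The count normalization (saturating at 3 shared items) is an assumption made so the result lands in 0-1; it is not documented behavior:

```typescript
// Combine relationship evidence into an edge weight in [0, 1].
// Weights (0.3 / 0.4 / 0.3) follow the formula above; the
// per-count normalization is an assumption.
function edgeWeight(
  sharedTagCount: number,
  sharedEntityCount: number,
  semanticSimilarity: number // already 0-1
): number {
  const norm = (n: number) => Math.min(n / 3, 1); // saturate at 3 shared items
  const weight =
    norm(sharedTagCount) * 0.3 +
    norm(sharedEntityCount) * 0.4 +
    semanticSimilarity * 0.3;
  return Math.min(Math.max(weight, 0), 1); // clamp to [0, 1]
}
```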

Stale Edge Cleanup

During sleep cycles, low-confidence edges are pruned:

terminal
// Remove edges where:
lastVerifiedAt < (now - 7 days) &&
confidence <= 0.35

This prevents graph clutter from outdated relationships.
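The pruning rule translates directly into a filter. A sketch with illustrative types:

```typescript
interface Edge {
  id: string;
  confidence: number;     // 0-1
  lastVerifiedAt: number; // ms epoch
}

const WEEK_MS = 7 * 24 * 60 * 60 * 1000;

// Keep only edges that are either recently verified or reasonably
// confident, per the stale-edge rule above.
function pruneStaleEdges(edges: Edge[], now: number): Edge[] {
  return edges.filter(
    (e) => !(e.lastVerifiedAt < now - WEEK_MS && e.confidence <= 0.35)
  );
}
```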

Relationship Score

Each memory calculates a relationship score:

terminal
relationshipScore = totalEdgeCount + (recentEdgeCount * 2)

// Used for tier qualification:
relationshipScore >= 6 → hot tier candidate
relationshipScore >= 3 → warm tier candidate
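The score formula above is a direct computation:

```typescript
// Relationship score: total edge count plus double weight
// for recently created edges.
function relationshipScore(
  totalEdgeCount: number,
  recentEdgeCount: number
): number {
  return totalEdgeCount + recentEdgeCount * 2;
}
```

A memory with 4 edges, one of them recent, scores 6 and becomes a hot-tier candidate.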

Summaries and Previews

Summaries provide quick context for memories and chunks.

Summary Scopes

Chunk-Level Summaries

  • Scope: Single chunk
  • Length: 1-2 sentences
  • Use: Quick preview in search results

Memory-Level Summaries

  • Scope: Entire memory item
  • Length: 2-3 sentences
  • Use: Card previews, list views

Generation Process

Using @kybernesis/pipeline:

terminal
import { summarize } from '@kybernesis/pipeline';

const summary = summarize(content, {
  maxSentences: 2,      // Limit length
  preserveContext: true // Keep key details
});

Algorithm:

  1. Split content by sentence boundaries (periods, exclamation points, question marks)
  2. Extract first N sentences (default: 3)
  3. Join sentences with spaces
  4. Store with metadata:
    terminal
    {
      summary: "Q4 focuses on revenue growth. Engineering priorities include...",
      // Note: Current implementation uses extractive summarization
      // (first N sentences), not LLM-based summarization
      updatedAt: 1698765555000
    }
    
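The extractive algorithm above fits in a few lines. A sketch; the sentence splitting is naive and the function name is illustrative:

```typescript
// Extractive summary: take the first N sentences and join with spaces.
function summarizeExtractive(content: string, maxSentences = 3): string {
  // Split on sentence boundaries (periods, exclamation points, question marks)
  const sentences = content.match(/[^.!?]+[.!?]+/g) ?? [content];
  return sentences
    .slice(0, maxSentences)
    .map((s) => s.trim())
    .join(" ");
}
```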

Summary Refresh

Summaries regenerate when:

  • Memory tier changes
  • Content is updated
  • User triggers re-summarization
  • Sleep agent runs summarize step

Display in UI

Search results:

terminal
[Memory Title]
Summary: "Brief 1-2 sentence overview..."
Tags: [tag1] [tag2] [tag3]

Topology nodes:

terminal
┌──────────────────┐
│ Memory Title     │
│ ──────────────── │
│ Summary text...  │
└──────────────────┘

Decay Scoring

Decay measures how "stale" a memory has become over time.

Decay Formula

terminal
decayScore = calculateDecay({
  lastAccessedAt: 1698000000000,
  createdAt: 1690000000000,
  accessCount: 5,
  now: Date.now()
});

// Factors:
// - Time since last access (exponential)
// - Time since creation (linear)
// - Access frequency (inverse)

// Range: 0.0 (fresh) to 1.0 (stale)
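The exact weighting is not documented; the following is a hedged sketch consistent with the listed factors: exponential in access staleness, linear in age, inverse in access count. All constants are assumptions:

```typescript
interface DecayInput {
  lastAccessedAt: number; // ms epoch
  createdAt: number;      // ms epoch
  accessCount: number;
  now: number;            // ms epoch
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Sketch only: the real formula's constants are not documented.
function calculateDecay(m: DecayInput): number {
  const staleDays = (m.now - m.lastAccessedAt) / DAY_MS;
  const ageDays = (m.now - m.createdAt) / DAY_MS;

  const accessDecay = 1 - Math.exp(-staleDays / 30); // exponential in staleness
  const ageDecay = Math.min(ageDays / 365, 1);       // linear in age, capped
  const frequencyBoost = 1 / (1 + m.accessCount);    // inverse in frequency

  const score = 0.6 * accessDecay + 0.2 * ageDecay + 0.2 * frequencyBoost;
  return Math.min(Math.max(score, 0), 1);            // clamp to [0, 1]
}
```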

Decay Thresholds

terminal
decayScore <= 0.25  →  Hot tier candidate
decayScore >= 0.60  →  Archive candidate (soft)
decayScore >= 0.80  →  Archive candidate (hard)

Decay Adjustment

Access resets decay:

terminal
Before access: decayScore = 0.8 (very stale)
After access:  decayScore = 0.1 (fresh)

Relationships slow decay:

terminal
Isolated memory: decayScore increases quickly
Connected memory: decayScore increases slowly

Priority Calculation

Priority represents memory importance.

Priority Factors

terminal
priority = weighted_average([
  userExplicitPriority * 0.4,    // Manual priority setting
  accessFrequency * 0.2,         // How often accessed
  relationshipDensity * 0.2,     // Knowledge graph centrality
  recency * 0.1,                 // How recent
  sourceImportance * 0.1         // Source type weight
]);

// Range: 0.0 (low) to 1.0 (high)
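With the weights above, the calculation is a plain weighted sum. Each factor is assumed to be pre-normalized to 0-1; that normalization is an assumption, not documented behavior:

```typescript
interface PriorityFactors {
  userExplicitPriority: number; // 0-1, manual setting
  accessFrequency: number;      // 0-1, normalized access rate
  relationshipDensity: number;  // 0-1, graph centrality
  recency: number;              // 0-1, 1 = just touched
  sourceImportance: number;     // 0-1, per-source weight
}

// Weighted sum of the documented factors; weights sum to 1,
// so the result stays in [0, 1] when inputs are in [0, 1].
function calculatePriority(f: PriorityFactors): number {
  return (
    f.userExplicitPriority * 0.4 +
    f.accessFrequency * 0.2 +
    f.relationshipDensity * 0.2 +
    f.recency * 0.1 +
    f.sourceImportance * 0.1
  );
}
```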

Priority Updates

Priority recalculates when:

  • Memory is accessed
  • Relationships change
  • User sets manual priority
  • Sleep agent runs tier step

Priority Thresholds

terminal
priority >= 0.65  →  Hot tier candidate
priority >= 0.30  →  Warm tier candidate
priority <  0.30  →  Archive candidate

Troubleshooting

Issue: Memories Not Being Tagged

Symptoms:

  • autoTags array is empty
  • Tags not appearing in UI

Diagnosis:

  1. Check lastTaggedAt timestamp
  2. Verify sleep agent is running
  3. Check sleep run logs for tagging step

Solutions:

  • Wait for next sleep cycle (every 60 minutes)
  • Manually trigger sleep run
  • Check content exists (empty content = no tags)
  • Verify OpenAI API key is configured

Issue: All Memories Stuck in Hot Tier

Symptoms:

  • No memories moving to warm/archive
  • Tier distribution heavily skewed

Diagnosis:

  1. Check lastAccessedAt timestamps
  2. Review access patterns (frequent access?)
  3. Verify sleep agent tier step is running

Solutions:

  • Recent access (< 3 days) keeps memories hot
  • Wait for access window to expire
  • Check for automated processes accessing memories
  • Verify tier change criteria are appropriate

Issue: Relationships Not Detected

Symptoms:

  • Knowledge graph is sparse
  • No edges between memories

Diagnosis:

  1. Check if memories share tags
  2. Verify lastRelationshipAuditAt timestamp
  3. Review sleep run logs for linking step

Solutions:

  • Add more tags to memories
  • Wait for sleep cycle to run linking
  • Ensure memories have overlapping entities
  • Check confidence threshold configuration

Issue: Poor Search Results

Symptoms:

  • Relevant memories not returned
  • Low similarity scores

Diagnosis:

  1. Check if memories are chunked
  2. Verify embeddings exist in Chroma
  3. Test both vector and metadata search separately

Solutions:

  • Re-ingest memories to regenerate embeddings
  • Check embedding model version consistency
  • Try different query phrasing
  • Use tag filters to narrow results
  • Verify Chroma collection is accessible

Issue: Summaries Missing

Symptoms:

  • No preview text in search results
  • Summary fields are null

Diagnosis:

  1. Check memorySummaries table
  2. Verify sleep agent summarize step runs
  3. Check OpenAI API quotas

Solutions:

  • Trigger sleep cycle manually
  • Verify API key has credits
  • Check for errors in sleep run logs
  • Re-summarize specific memories via API

Next Steps