Memory System Deep Dive

This guide provides a comprehensive look at how Kybernesis stores, processes, and manages memories.

Memory Data Model

Kybernesis uses a multi-table data model to efficiently store and query memories.

Core Tables

Memory Items

The main memory record containing metadata:

terminal
{
  id: "mem_abc123",              // Unique identifier
  orgId: "org_xyz789",           // Organization (multi-tenancy)
  itemType: "document",          // Type of memory
  title: "Q4 Planning Doc",      // Display name
  description: "Strategic...",   // Optional description
  source: "upload",              // Origin (upload/chat/connector)
  sourceRef: "file-key-123",     // Reference to source
  status: "ingested",            // Processing status
  priority: 0.75,                // Importance (0-1)
  decayScore: 0.15,              // Decay metric (0-1)
  tags: ["planning", "2024"],    // Combined tags
  autoTags: ["planning", "q4"],  // AI-generated tags
  manualTags: ["2024"],          // User-added tags
  inferredEntities: ["..."],     // Extracted entities
  tier: "hot",                   // Storage tier
  lastTaggedAt: 1698765432000,   // Last tagging timestamp
  lastRelationshipAuditAt: ...,  // Last linking timestamp
  metadata: { ... },             // Custom metadata
  ingestMetadata: { ... },       // Ingestion details
  contentHash: "sha256:...",     // Deduplication hash
  ingestedAt: 1698765432000,     // Creation timestamp
  updatedAt: 1698765555000       // Last modified timestamp
}

Memory Chunks

Content segments for vector search:

terminal
{
  id: "chunk_def456",
  orgId: "org_xyz789",
  memoryId: "mem_abc123",        // Parent memory
  chunkIndex: 0,                 // Position in document
  layer: "hot",                  // Storage tier
  content: "Q4 objectives...",   // Chunk text
  summary: "Brief overview...",  // Optional summary
  vectorId: "vec_ghi789",        // Chroma vector ID
  hotKey: "hot:mem_abc123:0",    // KV cache key
  embeddingVersion: "v1",        // Model version
  metadata: { ... },             // Chunk-specific metadata
  createdAt: 1698765432000,
  updatedAt: 1698765555000
}

Memory Entities

Knowledge graph nodes:

terminal
{
  id: "ent_jkl012",
  orgId: "org_xyz789",
  name: "Alice Johnson",         // Entity name
  type: "person",                // Entity type
  salience: 0.8,                 // Prominence (0-1)
  embeddingVersion: "v1",
  metadata: { ... },
  createdAt: 1698765432000,
  updatedAt: 1698765555000
}

Memory Edges

Knowledge graph relationships:

terminal
{
  id: "edge_mno345",
  orgId: "org_xyz789",
  fromEntityId: "ent_jkl012",    // Source entity
  toEntityId: "ent_pqr678",      // Target entity
  relation: "works_with",        // Relationship type
  weight: 0.9,                   // Strength (0-1)
  contextChunkId: "chunk_...",   // Where found
  source: "auto",                // Detection method
  confidence: 0.85,              // Certainty (0-1)
  createdByJobId: "job_...",     // Sleep run ID
  lastVerifiedAt: 1698765555000, // Last confirmation
  updatedAt: 1698765555000
}

Memory Summaries

Condensed representations:

terminal
{
  id: "sum_stu901",
  orgId: "org_xyz789",
  memoryId: "mem_abc123",
  chunkId: "chunk_def456",       // Optional chunk reference
  scope: "chunk",                // "chunk" or "memory"
  summary: "Q4 focuses on...",   // Summary text
  model: "gpt-4o-mini",          // AI model used
  tokens: 150,                   // Token count
  metadata: { ... },
  updatedAt: 1698765555000
}

Relationships Between Tables

terminal
Memory Item (1)
  ├── has many → Memory Chunks (N)
  │     └── each has → Vector Embedding in Chroma
  ├── has many → Memory Summaries (N)
  ├── references → Entities via inferredEntities
  └── participates in → Edges via contextChunkId

Entity (1)
  ├── has many outgoing → Edges (N)
  └── has many incoming → Edges (N)

Chunking Strategy

Kybernesis splits large content into manageable chunks for optimal search performance.

Why Chunking?

Benefits:

  • Better vector embeddings (focused, coherent text)
  • Granular retrieval (find specific sections)
  • Efficient processing (parallel chunk encoding)
  • Memory efficiency (load only relevant chunks)

Without chunking:

terminal
Search: "What are the Q4 budget priorities?"
Returns: Entire 50-page document
Problem: You have to read everything to find the answer

With chunking:

terminal
Search: "What are the Q4 budget priorities?"
Returns: Chunk 23 - "Budget Priorities" section (1 page)
Benefit: Precise answer in context

Chunking Rules

Default Parameters

  • Chunk size: 500-1000 characters
  • Overlap: None (clean splits)
  • Boundary: Sentence/paragraph boundaries
  • Index: Sequential (0, 1, 2, ...)

Splitting Logic

The chunking system (@kybernesis/pipeline) works as follows (a sketch in code follows this list):

  1. Parse content into sentences
  2. Group sentences until reaching target size
  3. Respect boundaries - never split mid-sentence
  4. Create chunks with metadata:
    • Position in document (chunkIndex)
    • Parent memory ID
    • Original content length
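
A minimal sketch of this splitting logic in TypeScript (hypothetical helper and type names; not the actual @kybernesis/pipeline source):

terminal
// Split text into sentence-bounded chunks of roughly 500-1000 characters.
interface Chunk {
  chunkIndex: number;
  content: string;
}

function chunkContent(text: string, maxSize = 1000): Chunk[] {
  // Naive sentence split on terminal punctuation followed by whitespace.
  const sentences = text.split(/(?<=[.!?])\s+/);
  const chunks: Chunk[] = [];
  let current = "";

  for (const sentence of sentences) {
    // Never split mid-sentence: flush the current chunk first.
    if (current && current.length + sentence.length + 1 > maxSize) {
      chunks.push({ chunkIndex: chunks.length, content: current });
      current = "";
    }
    current += (current ? " " : "") + sentence;
  }
  if (current) {
    chunks.push({ chunkIndex: chunks.length, content: current });
  }
  return chunks;
}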

Example Chunking

terminal
Input: 2500-character document

Output:
Chunk 0 (850 chars): Introduction and background...
Chunk 1 (920 chars): Methodology and approach...
Chunk 2 (730 chars): Results and conclusions...

Chunk Metadata

Each chunk preserves:

  • Parent memory title
  • Source information
  • Tag inheritance
  • Timestamp from parent

This ensures chunks are self-contained and searchable independently.


Vector Embeddings

Vector embeddings power semantic search by converting text into numerical representations.

What are Embeddings?

An embedding is a dense vector (array of numbers) that captures the semantic meaning of text.

Example:

terminal
Text: "machine learning models"
Embedding: [0.23, -0.15, 0.78, ..., 0.42]  (1536 dimensions)

Similar meanings → Similar vectors:

terminal
"machine learning" ~ "artificial intelligence" ~ "neural networks"
Distance between vectors: small (0.1)

"machine learning" vs "cooking recipes"
Distance between vectors: large (0.9)

Embedding Generation

Model

  • Default: OpenAI text-embedding-3-small
  • Dimensions: 1536
  • Version tracking: Embedded in chunk metadata

Process

  1. Chunk created during ingestion
  2. Content sent to OpenAI API
  3. Embedding vector returned
  4. Vector stored in Chroma with metadata:
    terminal
    {
      id: "chunk_abc123",
      embedding: [0.23, -0.15, ...],  // 1536-dimensional vector
      metadata: {
        memoryId: "mem_xyz789",
        orgId: "org_123",
        layer: "hot",
        chunkIndex: 0
      }
    }
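
Steps 2-4 might look like the following sketch, assuming the official openai Node SDK and the chromadb JS client (identifiers and the metadata shape are illustrative):

terminal
import OpenAI from "openai";
import { ChromaClient } from "chromadb";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment
const chroma = new ChromaClient();

// Embed one chunk and store the vector in Chroma with its metadata.
async function embedChunk(
  chunkId: string,
  content: string,
  metadata: Record<string, string | number>
) {
  const response = await openai.embeddings.create({
    model: "text-embedding-3-small", // 1536 dimensions
    input: content,
  });

  const collection = await chroma.getOrCreateCollection({
    name: "kybernesis_memories",
  });
  await collection.add({
    ids: [chunkId],
    embeddings: [response.data[0].embedding],
    metadatas: [metadata],
    documents: [content],
  });
}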
    

Storage in Chroma

ChromaDB is a specialized vector database that:

  • Stores embeddings efficiently
  • Supports fast similarity search
  • Handles millions of vectors
  • Filters by metadata

Collection structure:

terminal
Collection: "kybernesis_memories"
  ├── Tenant: "default"
  ├── Database: "kybernesis"
  └── Vectors: ~100,000 chunks

Similarity Search

When you query:

  1. Query text → embedding vector
  2. Compare against all chunk vectors
  3. Calculate distances (cosine, euclidean)
  4. Return top N most similar chunks
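
As a sketch against the chromadb JS client (the collection name matches the structure above; exact client signatures vary by version):

terminal
import { ChromaClient } from "chromadb";

const chroma = new ChromaClient();

// Return the top 5 chunks nearest to a query embedding, scoped to one org.
async function searchChunks(queryEmbedding: number[], orgId: string) {
  const collection = await chroma.getCollection({ name: "kybernesis_memories" });
  return collection.query({
    queryEmbeddings: [queryEmbedding],
    nResults: 5,              // top N most similar chunks
    where: { orgId },         // metadata filter (multi-tenancy)
  });
}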

Distance → Similarity conversion:

terminal
similarity = 1 / (1 + distance)

distance = 0.0 → similarity = 1.0 (perfect match)
distance = 1.0 → similarity = 0.5 (moderate)
distance = 10.0 → similarity = 0.09 (poor)
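
As a runnable helper (cosine distance computed by hand for illustration):

terminal
// Cosine distance between two equal-length vectors.
function cosineDistance(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// The conversion shown above: 0 → 1.0, 1 → 0.5, 10 → ~0.09.
function toSimilarity(distance: number): number {
  return 1 / (1 + distance);
}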

Embedding Versions

Embeddings are versioned to support:

  • Model upgrades (e.g., v1 → v2)
  • Reprocessing old memories
  • A/B testing different models

Version stored in:

  • memoryChunks.embeddingVersion
  • Chroma metadata

Memory Tiering

Automatic storage tier management optimizes retrieval performance and cost.

Tier Definitions

Tier      Speed     Use Case                      Retention
Hot       Fastest   Active, frequently accessed   Indefinite
Warm      Moderate  Occasional access             Until archive criteria met
Archive   Slower    Rarely accessed               Indefinite

Tiering Criteria

The tiering system evaluates each memory against specific thresholds:

Hot Tier Qualification (ANY condition)

terminal
// Stay hot if ANY of these are true:
priority >= 0.65                    // High importance
decayScore <= 0.25                  // Low decay
timeSinceAccess <= 3 days           // Recently used
relationshipScore >= 6              // Dense connections
recentEdges >= 4                    // Active connections
isPinned === true                   // User pinned

Reasons for hot tier:

  • high_priority - Priority score ≥ 0.65
  • low_decay - Decay score ≤ 0.25
  • recent_access - Accessed within 3 days
  • dense_graph - Relationship score ≥ 6
  • active_connections - Recent edges ≥ 4
  • manual_pin - User pinned

Warm Tier Qualification (ANY condition)

terminal
// Move to warm if ANY of these are true:
priority >= 0.3                     // Moderate importance
timeSinceAccess <= 21 days          // Recently accessed
relationshipScore >= 3              // Some connections
manualTags.length > 0               // User-tagged

Reasons for warm tier:

  • moderate_priority - Priority ≥ 0.3
  • recently_accessed - Accessed within 21 days
  • connected_graph - Relationship score ≥ 3
  • manual_tags_present - Has user tags

Archive Tier Qualification (ALL conditions)

terminal
// Move to archive if ALL of these are true:
timeSinceAccess >= 30 days          // Stale access
priority < 0.3                      // Low priority
decayScore >= 0.6                   // High decay (0.8 for hard archive)
relationshipScore <= 2              // Isolated
recentEdges === 0                   // No activity
manualTags.length === 0             // No user tags

Reasons for archive:

  • stale_and_low_priority - 30+ days + low priority
  • cold_and_disconnected - 45+ days + isolated
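
Putting the three rule sets together, a sketch of the evaluation (thresholds from this page; field names and the actual sleep-agent code may differ):

terminal
type Tier = "hot" | "warm" | "archive";

interface TierInputs {
  priority: number;
  decayScore: number;
  daysSinceAccess: number;
  relationshipScore: number;
  recentEdges: number;
  manualTagCount: number;
  isPinned: boolean;
}

function evaluateTier(m: TierInputs): Tier {
  // Hot: ANY single condition keeps the memory hot.
  if (
    m.priority >= 0.65 ||
    m.decayScore <= 0.25 ||
    m.daysSinceAccess <= 3 ||
    m.relationshipScore >= 6 ||
    m.recentEdges >= 4 ||
    m.isPinned
  ) {
    return "hot";
  }

  // Archive: ALL conditions must hold.
  if (
    m.daysSinceAccess >= 30 &&
    m.priority < 0.3 &&
    m.decayScore >= 0.6 &&
    m.relationshipScore <= 2 &&
    m.recentEdges === 0 &&
    m.manualTagCount === 0
  ) {
    return "archive";
  }

  // Otherwise warm (covers the ANY-condition warm rules above).
  return "warm";
}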

Tier Transitions

Memories move between tiers during sleep cycles:

terminal
Hot → Warm
  Trigger: Unused for 4+ days AND priority drops below 0.65

Warm → Archive
  Trigger: Unused for 30+ days AND meets all archive criteria

Archive → Warm
  Trigger: New access OR manual tag added

Warm → Hot
  Trigger: Frequent access OR new relationships OR priority increase

Implementation Details

Fields updated:

  • memoryItems.tier - "hot" | "warm" | "archive"
  • memoryChunks.layer - "hot" | "warm" | "archive" (synchronized)

Tier change tracking:

terminal
interface TierChange {
  memoryId: string;
  currentTier: "hot" | "warm" | "archive";
  targetTier: "hot" | "warm" | "archive";
  reason: string;  // e.g., "stale_and_low_priority"
}

Automatic processes:

  1. Sleep agent collects tier change candidates
  2. Evaluates each memory against criteria
  3. Updates tier in database
  4. Optionally refreshes summaries for tier changes

Tagging System

Automatic and manual tagging provides flexible memory organization.

Auto-Tagging Process

Rule-Based Tags

Sources:

  1. Source type - "upload", "chat", "connector"
  2. File extensions - "pdf", "md", "docx"
  3. Connector providers - "google-drive", "notion"
  4. Filename keywords - Extracted from title

Example:

terminal
Input: {
  source: "upload",
  title: "Q4_Budget_Analysis.pdf",
  metadata: { filename: "Q4_Budget_Analysis.pdf" }
}

Auto tags: ["upload", "pdf"]

Content-Based Tags

Keyword extraction:

  1. Summarize content (2 sentences)
  2. Split into tokens
  3. Filter tokens ≥ 4 characters
  4. Take top 6 keywords
  5. Normalize and deduplicate

Example:

terminal
Content: "This document outlines the quarterly budget analysis
          for engineering and product teams..."

Extracted keywords: ["document", "quarterly", "budget", "analysis",
                     "engineering", "product"]

Tag Limits

  • Maximum auto tags: 6 per memory
  • Minimum keyword length: 4 characters
  • Normalization: Lowercase, trim punctuation
  • Deduplication: Case-insensitive unique
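
A sketch of these extraction rules (the two-sentence summarization step is assumed to happen upstream, and "top 6" is simplified here to the first six qualifying tokens):

terminal
// Extract up to 6 normalized keywords (>= 4 chars) from a summary string.
function extractKeywords(summary: string, maxTags = 6, minLength = 4): string[] {
  const seen = new Set<string>();
  const keywords: string[] = [];

  for (const token of summary.split(/\s+/)) {
    // Normalize: lowercase, then trim surrounding punctuation.
    const normalized = token
      .toLowerCase()
      .replace(/^[^a-z0-9]+|[^a-z0-9]+$/g, "");
    if (normalized.length < minLength || seen.has(normalized)) continue;
    seen.add(normalized); // case-insensitive deduplication
    keywords.push(normalized);
    if (keywords.length >= maxTags) break;
  }
  return keywords;
}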

Manual Tagging

Users add custom tags through the UI:

Characteristics:

  • Unlimited manual tags
  • Never removed by system
  • Persistent across re-tagging
  • Prevent archival (memories with manual tags stay warm+)

UI indicators:

  • Emerald badge color
  • "Manual" label in tag list
  • Editable/removable

Combined Tags Array

The system maintains three tag arrays:

terminal
{
  autoTags: ["upload", "pdf", "budget"],      // System-generated
  manualTags: ["2025-planning", "important"], // User-added
  tags: ["upload", "pdf", "budget",           // Union of both
          "2025-planning", "important"]
}
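
The combined array is a case-insensitive union, roughly:

terminal
// Merge auto and manual tags, deduplicating case-insensitively
// while preserving first-seen casing.
function combineTags(autoTags: string[], manualTags: string[]): string[] {
  const seen = new Set<string>();
  const combined: string[] = [];
  for (const tag of [...autoTags, ...manualTags]) {
    const key = tag.toLowerCase();
    if (!seen.has(key)) {
      seen.add(key);
      combined.push(tag);
    }
  }
  return combined;
}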

Used for:

  • Search filtering
  • Relationship detection
  • Memory clustering
  • Topology visualization

Tag Refresh Schedule

Auto tags regenerate when:

  • Memory not tagged in 7+ days
  • Sleep agent runs tagging step
  • Limited to 20 memories per run (performance)

Process:

  1. Query memories where lastTaggedAt < (now - 7 days)
  2. Limit to 20 oldest
  3. Regenerate auto tags
  4. Merge with existing manual tags
  5. Update tags array
  6. Set lastTaggedAt = now

Relationship Detection

The linking system builds a knowledge graph by detecting relationships between memories.

Detection Strategy

Tag-Based Linking

Memories sharing tags are candidates for relationships:

terminal
Memory A: tags = ["machine-learning", "python", "tensorflow"]
Memory B: tags = ["machine-learning", "neural-networks"]

Shared tags: ["machine-learning"]
→ Relationship proposed

Entity-Based Linking

Memories mentioning the same entities:

terminal
Memory A: inferredEntities = ["Alice", "Product Team", "Q4"]
Memory B: inferredEntities = ["Alice", "Engineering", "Q4"]

Shared entities: ["Alice", "Q4"]
→ Relationship proposed

Relationship Proposal

Each proposal includes:

terminal
{
  fromMemoryId: "mem_abc123",
  toMemoryId: "mem_def456",
  relation: "related_to",        // Relationship type
  weight: 0.75,                  // Strength (0-1)
  confidence: 0.8,               // Certainty (0-1)
  evidence: {
    sharedTags: ["machine-learning"],
    sharedEntities: ["Alice"],
    contextChunk: "chunk_ghi789"
  }
}

Relationship Types

Common relation types:

  • related_to - General connection
  • references - Citation/mention
  • part_of - Hierarchical
  • similar_to - Semantic similarity
  • precedes / follows - Temporal
  • authored_by - Attribution

Edge Weight Calculation

terminal
weight = (
  sharedTagCount * 0.3 +
  sharedEntityCount * 0.4 +
  semanticSimilarity * 0.3
)

// Normalized to 0-1 range
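
As a concrete function (only the 0.3/0.4/0.3 weights come from above; the count normalizers are illustrative assumptions):

terminal
// Weighted edge strength, clamped to the 0-1 range.
function edgeWeight(
  sharedTagCount: number,
  sharedEntityCount: number,
  semanticSimilarity: number // assumed already 0-1
): number {
  const tagScore = Math.min(sharedTagCount / 3, 1);       // cap at 3 shared tags
  const entityScore = Math.min(sharedEntityCount / 3, 1); // cap at 3 shared entities
  const weight = tagScore * 0.3 + entityScore * 0.4 + semanticSimilarity * 0.3;
  return Math.min(Math.max(weight, 0), 1);
}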

Stale Edge Cleanup

During sleep cycles, low-confidence edges are pruned:

terminal
// Remove edges where:
lastVerifiedAt < (now - 7 days) &&
confidence <= 0.35

This prevents graph clutter from outdated relationships.

Relationship Score

Each memory calculates a relationship score:

terminal
relationshipScore = totalEdgeCount + (recentEdgeCount * 2)

// Used for tier qualification:
relationshipScore >= 6 → hot tier candidate
relationshipScore >= 3 → warm tier candidate

Summaries and Previews

Summaries provide quick context for memories and chunks.

Summary Scopes

Chunk-Level Summaries

  • Scope: Single chunk
  • Length: 1-2 sentences
  • Use: Quick preview in search results

Memory-Level Summaries

  • Scope: Entire memory item
  • Length: 2-3 sentences
  • Use: Card previews, list views

Generation Process

Using @kybernesis/pipeline:

terminal
import { summarize } from '@kybernesis/pipeline';

const summary = summarize(content, {
  maxSentences: 2,      // Limit length
  preserveContext: true // Keep key details
});

Algorithm:

  1. Extract first N sentences
  2. Optionally use LLM for extractive summary
  3. Ensure coherence and completeness
  4. Store with metadata:
    terminal
    {
      summary: "Q4 focuses on...",
      model: "gpt-4o-mini",
      tokens: 150,
      updatedAt: 1698765555000
    }
    

Summary Refresh

Summaries regenerate when:

  • Memory tier changes
  • Content is updated
  • User triggers re-summarization
  • Sleep agent runs summarize step

Display in UI

Search results:

terminal
[Memory Title]
Summary: "Brief 1-2 sentence overview..."
Tags: [tag1] [tag2] [tag3]

Topology nodes:

terminal
┌──────────────────┐
│ Memory Title     │
│ ──────────────── │
│ Summary text...  │
└──────────────────┘

Decay Scoring

Decay measures how "stale" a memory has become over time.

Decay Formula

terminal
decayScore = calculateDecay({
  lastAccessedAt: 1698000000000,
  createdAt: 1690000000000,
  accessCount: 5,
  now: Date.now()
});

// Factors:
// - Time since last access (exponential)
// - Time since creation (linear)
// - Access frequency (inverse)

// Range: 0.0 (fresh) to 1.0 (stale)
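
A sketch matching those factors (the curve constants here are illustrative assumptions, not the shipped values):

terminal
// Decay rises exponentially with time since last access, linearly with
// age, and is damped by access frequency. Illustrative constants.
function calculateDecay(opts: {
  lastAccessedAt: number;
  createdAt: number;
  accessCount: number;
  now: number;
}): number {
  const DAY = 86_400_000;
  const daysSinceAccess = (opts.now - opts.lastAccessedAt) / DAY;
  const daysSinceCreation = (opts.now - opts.createdAt) / DAY;

  const accessDecay = 1 - Math.exp(-daysSinceAccess / 30); // exponential
  const ageDecay = Math.min(daysSinceCreation / 365, 1);   // linear
  const infrequency = 1 / (1 + opts.accessCount);          // inverse frequency

  const raw = 0.6 * accessDecay + 0.2 * ageDecay + 0.2 * infrequency;
  return Math.min(Math.max(raw, 0), 1);
}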

Decay Thresholds

terminal
decayScore <= 0.25  →  Hot tier candidate
decayScore >= 0.60  →  Archive candidate (soft)
decayScore >= 0.80  →  Archive candidate (hard)

Decay Adjustment

Access resets decay:

terminal
Before access: decayScore = 0.8 (very stale)
After access:  decayScore = 0.1 (fresh)

Relationships slow decay:

terminal
Isolated memory: decayScore increases quickly
Connected memory: decayScore increases slowly

Priority Calculation

Priority represents memory importance.

Priority Factors

terminal
priority = weighted_average([
  userExplicitPriority * 0.4,    // Manual priority setting
  accessFrequency * 0.2,         // How often accessed
  relationshipDensity * 0.2,     // Knowledge graph centrality
  recency * 0.1,                 // How recent
  sourceImportance * 0.1         // Source type weight
]);

// Range: 0.0 (low) to 1.0 (high)
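
A direct TypeScript rendering (factor inputs are assumed to be pre-normalized to 0-1):

terminal
// Weighted-average priority; weights sum to 1.0.
function calculatePriority(f: {
  userExplicitPriority: number;
  accessFrequency: number;
  relationshipDensity: number;
  recency: number;
  sourceImportance: number;
}): number {
  return (
    f.userExplicitPriority * 0.4 +
    f.accessFrequency * 0.2 +
    f.relationshipDensity * 0.2 +
    f.recency * 0.1 +
    f.sourceImportance * 0.1
  );
}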

Priority Updates

Priority recalculates when:

  • Memory is accessed
  • Relationships change
  • User sets manual priority
  • Sleep agent runs tier step

Priority Thresholds

terminal
priority >= 0.65  →  Hot tier candidate
priority >= 0.30  →  Warm tier candidate
priority < 0.30   →  Archive candidate

Troubleshooting

Issue: Memories Not Being Tagged

Symptoms:

  • autoTags array is empty
  • Tags not appearing in UI

Diagnosis:

  1. Check lastTaggedAt timestamp
  2. Verify sleep agent is running
  3. Check sleep run logs for tagging step

Solutions:

  • Wait for next sleep cycle (every 60 minutes)
  • Manually trigger sleep run
  • Check content exists (empty content = no tags)
  • Verify OpenAI API key is configured

Issue: All Memories Stuck in Hot Tier

Symptoms:

  • No memories moving to warm/archive
  • Tier distribution heavily skewed

Diagnosis:

  1. Check lastAccessedAt timestamps
  2. Review access patterns (frequent access?)
  3. Verify sleep agent tier step is running

Solutions:

  • Recent access (< 3 days) keeps memories hot
  • Wait for access window to expire
  • Check for automated processes accessing memories
  • Verify tier change criteria are appropriate

Issue: Relationships Not Detected

Symptoms:

  • Knowledge graph is sparse
  • No edges between memories

Diagnosis:

  1. Check if memories share tags
  2. Verify lastRelationshipAuditAt timestamp
  3. Review sleep run logs for linking step

Solutions:

  • Add more tags to memories
  • Wait for sleep cycle to run linking
  • Ensure memories have overlapping entities
  • Check confidence threshold configuration

Issue: Poor Search Results

Symptoms:

  • Relevant memories not returned
  • Low similarity scores

Diagnosis:

  1. Check if memories are chunked
  2. Verify embeddings exist in Chroma
  3. Test both vector and metadata search separately

Solutions:

  • Re-ingest memories to regenerate embeddings
  • Check embedding model version consistency
  • Try different query phrasing
  • Use tag filters to narrow results
  • Verify Chroma collection is accessible

Issue: Summaries Missing

Symptoms:

  • No preview text in search results
  • Summary fields are null

Diagnosis:

  1. Check memorySummaries table
  2. Verify sleep agent summarize step runs
  3. Check OpenAI API quotas

Solutions:

  • Trigger sleep cycle manually
  • Verify API key has credits
  • Check for errors in sleep run logs
  • Re-summarize specific memories via API

Cognitive Memory Layer

Kybernesis goes beyond storage and retrieval with a cognitive layer that reasons about your knowledge.

Source Confidence Weighting

Not all data sources are equal. Every fact's confidence is weighted by its source trust level:

Source            Weight   Example
User correction   1.00     "That's wrong, it's actually..."
User input        0.95     Direct statement in chat
Agent chat        0.80     Chat-sourced data
Connector sync    0.70     Google Drive, Notion auto-sync
AI extraction     0.60     LLM-extracted from content

Final confidence = LLM confidence * source weight. User-provided information consistently outranks AI-extracted data.
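
A sketch of that weighting (weights from the table above; the sourceType keys are illustrative):

terminal
// Source trust weights, keyed by sourceType.
const SOURCE_WEIGHTS: Record<string, number> = {
  user_correction: 1.0,
  user_input: 0.95,
  agent_chat: 0.8,
  connector_sync: 0.7,
  ai_extraction: 0.6,
};

// Final confidence = LLM confidence * source weight.
function factConfidence(llmConfidence: number, sourceType: string): number {
  return llmConfidence * (SOURCE_WEIGHTS[sourceType] ?? 0.6);
}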

Fact Correction

When a fact is wrong, you can correct it:

  • Old fact is superseded (marked isLatest: false)
  • New fact created at confidence 1.0 with sourceType: user_correction
  • Contradiction recorded as user_resolved
  • User corrections never decay

Contradictions

When facts conflict, the system creates a contradiction record instead of silently picking a winner:

  • Auto-resolved: If confidence gap >= 0.3, higher confidence fact wins
  • Pending: If gap < 0.3, both facts kept and conflict surfaced to users/agents for review
  • Rate-limited to 20 per sleep cycle with deduplication
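
A sketch of the resolution rule:

terminal
interface Fact {
  id: string;
  confidence: number;
}

// Gap >= 0.3 auto-resolves to the higher-confidence fact;
// otherwise both facts are kept and surfaced for review.
function resolveContradiction(
  a: Fact,
  b: Fact
): { status: "auto_resolved"; winner: Fact } | { status: "pending" } {
  const gap = Math.abs(a.confidence - b.confidence);
  if (gap >= 0.3) {
    return { status: "auto_resolved", winner: a.confidence >= b.confidence ? a : b };
  }
  return { status: "pending" };
}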

Confidence Decay

Facts that go unreinforced lose confidence over time:

  • Decay rate: 2% per unreinforced week
  • Floor: Confidence never drops below 0.3
  • Corroboration: Same fact from multiple sources gets a confidence boost
  • Exempt: User corrections never decay
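
As a sketch (reading the 2% rate as multiplicative per week):

terminal
// 2% confidence decay per unreinforced week, floored at 0.3.
// User corrections are exempt.
function decayFactConfidence(
  confidence: number,
  weeksUnreinforced: number,
  sourceType: string
): number {
  if (sourceType === "user_correction") return confidence;
  const decayed = confidence * Math.pow(0.98, weeksUnreinforced);
  return Math.max(decayed, 0.3);
}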

Surprisal Scoring

Each fact gets a novelty score (0-1) based on Jaccard distance from existing knowledge:

  • 0.0-0.2: Near-duplicate
  • 0.5-0.7: Partially novel
  • 0.7-1.0: Highly novel (prioritized in reasoning and retrieval)
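
A minimal token-level sketch (the real scorer may tokenize differently or compare against more than raw fact text):

terminal
// Jaccard distance between two token sets.
function jaccardDistance(a: Set<string>, b: Set<string>): number {
  const intersection = [...a].filter((t) => b.has(t)).length;
  const union = new Set([...a, ...b]).size;
  return union === 0 ? 0 : 1 - intersection / union;
}

// Surprisal = distance to the *closest* existing fact (1 = nothing similar).
function surprisal(newFact: string, existingFacts: string[]): number {
  const tokenize = (s: string) => new Set(s.toLowerCase().split(/\s+/));
  const newTokens = tokenize(newFact);
  const distances = existingFacts.map((f) => jaccardDistance(newTokens, tokenize(f)));
  return distances.length ? Math.min(...distances) : 1;
}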

Reasoning Engine

The sleep agent derives new understanding from accumulated facts:

  • Deductions (confidence 0.80-0.90): Logically certain conclusions from 2+ facts
  • Inductions (confidence 0.60-0.75): Probable patterns from 3+ data points
  • Processes top 5 entities per cycle, max 5 insights per entity

Narrative Profiles

Entities with 5+ facts get LLM-generated prose summaries:

"Ian Borders is the founder and CEO of Kybernesis, an AI memory platform for agents. Based in San Francisco, he brings over 8 years of AI experience and a background in computer science."

Ingestion Validation

Noise entities are filtered before entering the knowledge graph: single characters, pure numbers, stop words, speaker labels, and OCR artifacts.


Next Steps