Memory System Deep Dive
This guide provides a comprehensive look at how Kybernesis stores, processes, and manages memories.
Table of Contents
- Memory Data Model
- Chunking Strategy
- Vector Embeddings
- Memory Tiering
- Tagging System
- Relationship Detection
- Summaries and Previews
- Decay Scoring
- Priority Calculation
- Troubleshooting
Memory Data Model
Kybernesis uses a multi-table data model to efficiently store and query memories.
Core Tables
Memory Items
The main memory record containing metadata:
{
id: "mem_abc123", // Unique identifier
orgId: "org_xyz789", // Organization (multi-tenancy)
itemType: "document", // Type of memory
title: "Q4 Planning Doc", // Display name
description: "Strategic...", // Optional description
source: "upload", // Origin (upload/chat/connector)
sourceRef: "file-key-123", // Reference to source
status: "ingested", // Processing status
priority: 0.75, // Importance (0-1)
decayScore: 0.15, // Decay metric (0-1)
tags: ["planning", "2024"], // Combined tags
autoTags: ["planning", "q4"], // AI-generated tags
manualTags: ["2024"], // User-added tags
inferredEntities: ["..."], // Extracted entities
tier: "hot", // Storage tier
lastTaggedAt: 1698765432000, // Last tagging timestamp
lastRelationshipAuditAt: ..., // Last linking timestamp
metadata: { ... }, // Custom metadata
ingestMetadata: { ... }, // Ingestion details
contentHash: "sha256:...", // Deduplication hash
ingestedAt: 1698765432000, // Creation timestamp
updatedAt: 1698765555000 // Last modified timestamp
}
Memory Chunks
Content segments for vector search:
{
id: "chunk_def456",
orgId: "org_xyz789",
memoryId: "mem_abc123", // Parent memory
chunkIndex: 0, // Position in document
layer: "hot", // Storage tier
content: "Q4 objectives...", // Chunk text
summary: "Brief overview...", // Optional summary
vectorId: "vec_ghi789", // Chroma vector ID
hotKey: "hot:mem_abc123:0", // KV cache key
embeddingVersion: "v1", // Model version
metadata: { ... }, // Chunk-specific metadata
createdAt: 1698765432000,
updatedAt: 1698765555000
}
Memory Entities
Knowledge graph nodes:
{
id: "ent_jkl012",
orgId: "org_xyz789",
name: "Alice Johnson", // Entity name
type: "person", // Entity type
salience: 0.8, // Prominence (0-1)
embeddingVersion: "v1",
metadata: { ... },
createdAt: 1698765432000,
updatedAt: 1698765555000
}
Memory Edges
Knowledge graph relationships:
{
id: "edge_mno345",
orgId: "org_xyz789",
fromEntityId: "ent_jkl012", // Source entity
toEntityId: "ent_pqr678", // Target entity
relation: "works_with", // Relationship type
weight: 0.9, // Strength (0-1)
contextChunkId: "chunk_...", // Where found
source: "auto", // Detection method
confidence: 0.85, // Certainty (0-1)
createdByJobId: "job_...", // Sleep run ID
lastVerifiedAt: 1698765555000, // Last confirmation
updatedAt: 1698765555000
}
Memory Summaries
Condensed representations:
{
id: "sum_stu901",
orgId: "org_xyz789",
memoryId: "mem_abc123",
chunkId: "chunk_def456", // Optional chunk reference
scope: "chunk", // "chunk" or "memory"
summary: "Q4 focuses on...", // Summary text
model: "gpt-4o-mini", // AI model used
tokens: 150, // Token count
metadata: { ... },
updatedAt: 1698765555000
}
Relationships Between Tables
Memory Item (1)
├── has many → Memory Chunks (N)
│ └── each has → Vector Embedding in Chroma
├── has many → Memory Summaries (N)
├── references → Entities via inferredEntities
└── participates in → Edges via contextChunkId
Entity (1)
├── has many outgoing → Edges (N)
└── has many incoming → Edges (N)
Chunking Strategy
Kybernesis splits large content into manageable chunks for optimal search performance.
Why Chunking?
Benefits:
- Better vector embeddings (focused, coherent text)
- Granular retrieval (find specific sections)
- Efficient processing (parallel chunk encoding)
- Memory efficiency (load only relevant chunks)
Without chunking:
Search: "What are the Q4 budget priorities?"
Returns: Entire 50-page document
Problem: You have to read everything to find the answer
With chunking:
Search: "What are the Q4 budget priorities?"
Returns: Chunk 23 - "Budget Priorities" section (1 page)
Benefit: Precise answer in context
Chunking Rules
Default Parameters
- Chunk size: 800-1200 characters (varies by ingestion type)
  - Chat messages: 800 characters
  - Uploads (advanced): 1200 characters
  - Connector sync: 1200 characters
- Overlap: 80-120 characters (prevents context loss)
  - Chat messages: 80 characters
  - Uploads: 120 characters
  - Connector sync: 120 characters
- Boundary: Sentence/paragraph boundaries (Markdown-aware for uploads)
- Index: Sequential (0, 1, 2, ...)
Splitting Logic
The chunking system (@kybernesis/pipeline):
- Parse content into sentences
- Group sentences until reaching target size
- Respect boundaries - never split mid-sentence
- Create chunks with metadata:
  - Position in document (chunkIndex)
  - Parent memory ID
  - Original content length
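A minimal sketch of this logic is shown below; the function is illustrative, not the actual @kybernesis/pipeline API:

```typescript
// Illustrative sentence-aware chunker; chunkText and its defaults are
// assumptions for this sketch, not the pipeline's real internals.
function chunkText(content: string, maxSize = 1200, overlap = 120): string[] {
  // Naive sentence split on terminal punctuation followed by whitespace
  const sentences = content.split(/(?<=[.!?])\s+/);
  const chunks: string[] = [];
  let current = "";

  for (const sentence of sentences) {
    if (current && current.length + sentence.length + 1 > maxSize) {
      chunks.push(current);
      // Carry the character tail of the previous chunk forward as overlap
      current = current.slice(-overlap) + " " + sentence;
    } else {
      current = current ? current + " " + sentence : sentence;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```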
Example Chunking
Input: 3000-character document
Output:
Chunk 0 (1200 chars): Introduction and background...
Chunk 1 (1200 chars, starts at 1080): Methodology and approach...
Chunk 2 (840 chars, starts at 2160): Results and conclusions...
Note: 120-char overlap between chunks for continuity
Chunk Metadata
Each chunk preserves:
- Parent memory title
- Source information
- Tag inheritance
- Timestamp from parent
This ensures chunks are self-contained and searchable independently.
Vector Embeddings
Vector embeddings power semantic search by converting text into numerical representations.
What are Embeddings?
An embedding is a dense vector (array of numbers) that captures the semantic meaning of text.
Example:
Text: "machine learning models"
Embedding: [0.23, -0.15, 0.78, ..., 0.42] (1536 dimensions)
Similar meanings → Similar vectors:
"machine learning" ~ "artificial intelligence" ~ "neural networks"
Distance between vectors: small (0.1)
"machine learning" vs "cooking recipes"
Distance between vectors: large (0.9)
Embedding Generation
Model
- Default: OpenAI text-embedding-3-small
- Dimensions: 1536
- Version tracking: Embedded in chunk metadata
Process
- Chunk created during ingestion
- Content sent to OpenAI API
- Embedding vector returned
- Vector stored in Chroma with metadata:
{
  id: "chunk_abc123",
  embedding: [0.23, -0.15, ...], // 1536-dimensional vector
  metadata: {
    memoryId: "mem_xyz789",
    orgId: "org_123",
    layer: "hot",
    chunkIndex: 0
  }
}
Storage in Chroma
ChromaDB is a specialized vector database that:
- Stores embeddings efficiently
- Supports fast similarity search
- Handles millions of vectors
- Filters by metadata
Collection structure:
Collection: "kybernesis_memories"
├── Tenant: "default"
├── Database: "kybernesis"
└── Vectors: ~100,000 chunks
Similarity Search
When you query:
- Query text → embedding vector
- Compare against all chunk vectors
- Calculate distances (cosine, euclidean)
- Return top N most similar chunks
Distance → Similarity conversion:
similarity = 1 / (1 + distance)
distance = 0.0 → similarity = 1.0 (perfect match)
distance = 1.0 → similarity = 0.5 (moderate)
distance = 10.0 → similarity = 0.09 (poor)
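A sketch of this query path, reusing the openai and chroma clients from the ingestion sketch above (the where filter uses the metadata fields shown earlier):

```typescript
// Hypothetical search helper: embed the query, find similar chunks,
// and convert Chroma distances into the similarity score above.
async function searchMemories(query: string, orgId: string) {
  const res = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: query,
  });
  const collection = await chroma.getOrCreateCollection({
    name: "kybernesis_memories",
  });
  const results = await collection.query({
    queryEmbeddings: [res.data[0].embedding],
    nResults: 5,
    where: { orgId }, // metadata filter for multi-tenancy
  });
  // similarity = 1 / (1 + distance)
  const similarities = results.distances?.[0]?.map((d) => 1 / (1 + d));
  return { results, similarities };
}
```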
Embedding Versions
Embeddings are versioned to support:
- Model upgrades (e.g., v1 → v2)
- Reprocessing old memories
- A/B testing different models
Version stored in:
- memoryChunks.embeddingVersion
- Chroma metadata
Memory Tiering
Automatic storage tier management optimizes retrieval performance and cost.
Tier Definitions
| Tier | Speed | Use Case | Retention |
|---|---|---|---|
| Hot | Fastest | Active, frequently accessed | Indefinite |
| Warm | Moderate | Occasional access | Until archive criteria met |
| Archive | Slower | Rarely accessed | Indefinite |
Tiering Criteria
The tiering system evaluates each memory against specific thresholds:
Hot Tier Qualification (ANY condition)
// Stay hot if ANY of these are true:
priority >= 0.65 // High importance
decayScore <= 0.25 // Low decay
timeSinceAccess <= 3 days // Recently used
relationshipScore >= 6 // Dense connections
recentEdges >= 4 // Active connections
isPinned === true // User pinned
Reasons for hot tier:
- high_priority - Priority score ≥ 0.65
- low_decay - Decay score ≤ 0.25
- recent_access - Accessed within 3 days
- dense_graph - Relationship score ≥ 6
- active_connections - Recent edges ≥ 4
- manual_pin - User pinned
Warm Tier Qualification (ANY condition)
// Move to warm if ANY of these are true:
priority >= 0.3 // Moderate importance
timeSinceAccess <= 21 days // Recently accessed
relationshipScore >= 3 // Some connections
manualTags.length > 0 // User-tagged
Reasons for warm tier:
- moderate_priority - Priority ≥ 0.3
- recently_accessed - Accessed within 21 days
- connected_graph - Relationship score ≥ 3
- manual_tags_present - Has user tags
Archive Tier Qualification (ALL conditions)
// Move to archive if ALL of these are true:
timeSinceAccess >= 30 days // Stale access
priority < 0.3 // Low priority
decayScore >= 0.6 // High decay (0.6 soft, 0.8 hard)
relationshipScore <= 2 // Isolated
recentEdges === 0 // No activity
manualTags.length === 0 // No user tags
Reasons for archive:
- stale_and_low_priority - 30+ days + low priority
- cold_and_disconnected - 45+ days + isolated
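A minimal sketch that ties the three rule sets together; the real sleep-agent evaluation may differ in detail:

```typescript
type Tier = "hot" | "warm" | "archive";

interface TierSignals {
  priority: number;          // 0-1
  decayScore: number;        // 0-1
  daysSinceAccess: number;
  relationshipScore: number;
  recentEdges: number;
  manualTagCount: number;
  isPinned: boolean;
}

function evaluateTier(s: TierSignals): Tier {
  // Hot: ANY condition qualifies
  if (
    s.priority >= 0.65 || s.decayScore <= 0.25 || s.daysSinceAccess <= 3 ||
    s.relationshipScore >= 6 || s.recentEdges >= 4 || s.isPinned
  ) return "hot";

  // Archive: ALL conditions must hold
  if (
    s.daysSinceAccess >= 30 && s.priority < 0.3 && s.decayScore >= 0.6 &&
    s.relationshipScore <= 2 && s.recentEdges === 0 && s.manualTagCount === 0
  ) return "archive";

  // Otherwise warm: neither clearly hot nor fully archivable
  return "warm";
}
```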
Tier Transitions
Memories move between tiers during sleep cycles:
Hot → Warm
Trigger: Unused for 4+ days AND priority drops below 0.65
Warm → Archive
Trigger: Unused for 30+ days AND meets all archive criteria
Archive → Warm
Trigger: New access OR manual tag added
Warm → Hot
Trigger: Frequent access OR new relationships OR priority increase
Implementation Details
Fields updated:
- memoryItems.tier - "hot" | "warm" | "archive"
- memoryChunks.layer - "hot" | "warm" | "archive" (synchronized)
Tier change tracking:
interface TierChange {
memoryId: string;
currentTier: "hot" | "warm" | "archive";
targetTier: "hot" | "warm" | "archive";
reason: string; // e.g., "stale_and_low_priority"
}
Automatic processes:
- Sleep agent collects tier change candidates
- Evaluates each memory against criteria
- Updates tier in database
- Optionally refreshes summaries for tier changes
Tagging System
Automatic and manual tagging provides flexible memory organization.
Auto-Tagging Process
Rule-Based Tags
Sources:
- Source type - "upload", "chat", "connector"
- File extensions - "pdf", "md", "docx"
- Connector providers - "google-drive", "notion"
- Filename keywords - Extracted from title
Example:
Input: {
source: "upload",
title: "Q4_Budget_Analysis.pdf",
metadata: { filename: "Q4_Budget_Analysis.pdf" }
}
Auto tags: ["upload", "pdf"]
Content-Based Tags
Keyword extraction:
- Summarize content (2 sentences)
- Split into tokens
- Filter tokens ≥ 4 characters
- Take top 6 keywords
- Normalize and deduplicate
Example:
Content: "This document outlines the quarterly budget analysis
for engineering and product teams..."
Extracted keywords: ["document", "quarterly", "budget", "analysis",
"engineering", "product"]
Tag Limits
- Maximum auto tags: 6 per memory
- Minimum keyword length: 4 characters
- Normalization: Lowercase, trim punctuation
- Deduplication: Case-insensitive unique
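A minimal sketch of keyword extraction under these limits. The frequency ranking is an assumption (the pipeline summarizes content before extracting keywords), and extractAutoTags is an illustrative helper, not the pipeline's actual function:

```typescript
// Extract up to maxTags normalized keywords from content
function extractAutoTags(content: string, maxTags = 6): string[] {
  const counts = new Map<string, number>();
  for (const raw of content.split(/\s+/)) {
    // Normalize: lowercase, trim surrounding punctuation
    const token = raw
      .toLowerCase()
      .replace(/^[^\p{L}\p{N}]+|[^\p{L}\p{N}]+$/gu, "");
    if (token.length < 4) continue; // minimum keyword length
    counts.set(token, (counts.get(token) ?? 0) + 1); // dedupe via Map keys
  }
  // Rank by frequency (an assumed heuristic) and take the top N
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1])
    .slice(0, maxTags)
    .map(([token]) => token);
}
```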
Manual Tagging
Users add custom tags through the UI:
Characteristics:
- Unlimited manual tags
- Never removed by system
- Persistent across re-tagging
- Prevent archival (memories with manual tags stay warm+)
UI indicators:
- Emerald badge color
- "Manual" label in tag list
- Editable/removable
Combined Tags Array
The system maintains three tag arrays:
{
autoTags: ["upload", "pdf", "budget"], // System-generated
manualTags: ["2025-planning", "important"], // User-added
tags: ["upload", "pdf", "budget", // Union of both
"2025-planning", "important"]
}
Used for:
- Search filtering
- Relationship detection
- Memory clustering
- Topology visualization
Tag Refresh Schedule
Auto tags regenerate when:
- Memory not tagged in 7+ days
- Sleep agent runs tagging step
- Limited to 20 memories per run (performance)
Process:
- Query memories where lastTaggedAt < (now - 7 days)
- Limit to the 20 oldest
- Regenerate auto tags
- Merge with existing manual tags
- Update the tags array
- Set lastTaggedAt = now
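A sketch of this refresh step. Here db is a hypothetical data-access layer (the query shape is illustrative), and extractAutoTags is the illustrative helper from earlier in this section:

```typescript
const SEVEN_DAYS = 7 * 24 * 60 * 60 * 1000;

// Hypothetical refresh job: re-tag up to 20 stale memories per run
async function refreshAutoTags(db: any, now = Date.now()): Promise<void> {
  // Oldest-first batch of at most 20 stale memories
  const stale = await db.memoryItems.find(
    { lastTaggedAt: { lessThan: now - SEVEN_DAYS } },
    { sort: "lastTaggedAt asc", limit: 20 }
  );

  for (const memory of stale) {
    const autoTags = extractAutoTags(memory.content ?? "");
    await db.memoryItems.update(memory.id, {
      autoTags,
      // tags = union of auto and manual tags, deduplicated
      tags: [...new Set([...autoTags, ...memory.manualTags])],
      lastTaggedAt: now,
    });
  }
}
```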
Relationship Detection
The linking system builds a knowledge graph by detecting relationships between memories.
Detection Strategy
Tag-Based Linking
Memories sharing tags are candidates for relationships:
Memory A: tags = ["machine-learning", "python", "tensorflow"]
Memory B: tags = ["machine-learning", "neural-networks"]
Shared tags: ["machine-learning"]
→ Relationship proposed
Entity-Based Linking
Memories mentioning the same entities:
Memory A: inferredEntities = ["Alice", "Product Team", "Q4"]
Memory B: inferredEntities = ["Alice", "Engineering", "Q4"]
Shared entities: ["Alice", "Q4"]
→ Relationship proposed
Relationship Proposal
Each proposal includes:
{
fromMemoryId: "mem_abc123",
toMemoryId: "mem_def456",
relation: "related_to", // Relationship type
weight: 0.75, // Strength (0-1)
confidence: 0.8, // Certainty (0-1)
evidence: {
sharedTags: ["machine-learning"],
sharedEntities: ["Alice"],
contextChunk: "chunk_ghi789"
}
}
Relationship Types
Common relation types:
- related_to - General connection
- references - Citation/mention
- part_of - Hierarchical
- similar_to - Semantic similarity
- precedes/follows - Temporal
- authored_by - Attribution
Edge Weight Calculation
weight = (
sharedTagCount * 0.3 +
sharedEntityCount * 0.4 +
semanticSimilarity * 0.3
)
// Normalized to 0-1 range
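A runnable version of this calculation; the cap used to normalize raw counts into the 0-1 range is an illustrative assumption:

```typescript
// Hypothetical normalization: raw counts capped at 3 before weighting
function edgeWeight(
  sharedTagCount: number,
  sharedEntityCount: number,
  semanticSimilarity: number // assumed to already be 0-1
): number {
  const norm = (n: number) => Math.min(n, 3) / 3; // assumed cap
  const raw =
    norm(sharedTagCount) * 0.3 +
    norm(sharedEntityCount) * 0.4 +
    semanticSimilarity * 0.3;
  return Math.min(1, Math.max(0, raw)); // clamp to 0-1
}
```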
Stale Edge Cleanup
During sleep cycles, low-confidence edges are pruned:
// Remove edges where:
lastVerifiedAt < (now - 7 days) &&
confidence <= 0.35
This prevents graph clutter from outdated relationships.
Relationship Score
Each memory calculates a relationship score:
relationshipScore = totalEdgeCount + (recentEdgeCount * 2)
// Used for tier qualification:
relationshipScore >= 6 → hot tier candidate
relationshipScore >= 3 → warm tier candidate
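For example, a memory with 4 total edges, 2 of them recent, scores 4 + (2 × 2) = 8, which qualifies it as a hot tier candidate.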
Summaries and Previews
Summaries provide quick context for memories and chunks.
Summary Scopes
Chunk-Level Summaries
- Scope: Single chunk
- Length: 1-2 sentences
- Use: Quick preview in search results
Memory-Level Summaries
- Scope: Entire memory item
- Length: 2-3 sentences
- Use: Card previews, list views
Generation Process
Using @kybernesis/pipeline:
import { summarize } from '@kybernesis/pipeline';
const summary = summarize(content, {
maxSentences: 2, // Limit length
preserveContext: true // Keep key details
});
Algorithm:
- Split content by sentence boundaries (periods, exclamation points, question marks)
- Extract first N sentences (default: 3)
- Join sentences with spaces
- Store with metadata:
{
  summary: "Q4 focuses on revenue growth. Engineering priorities include...",
  // Note: Current implementation uses extractive summarization
  // (first N sentences), not LLM-based summarization
  updatedAt: 1698765555000
}
Summary Refresh
Summaries regenerate when:
- Memory tier changes
- Content is updated
- User triggers re-summarization
- Sleep agent runs summarize step
Display in UI
Search results:
[Memory Title]
Summary: "Brief 1-2 sentence overview..."
Tags: [tag1] [tag2] [tag3]
Topology nodes:
┌──────────────────┐
│ Memory Title │
│ ──────────────── │
│ Summary text... │
└──────────────────┘
Decay Scoring
Decay measures how "stale" a memory has become over time.
Decay Formula
decayScore = calculateDecay({
lastAccessedAt: 1698000000000,
createdAt: 1690000000000,
accessCount: 5,
now: Date.now()
});
// Factors:
// - Time since last access (exponential)
// - Time since creation (linear)
// - Access frequency (inverse)
// Range: 0.0 (fresh) to 1.0 (stale)
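A minimal sketch of a decay function combining these factors; the half-life, age cap, and weights are illustrative assumptions, not the actual Kybernesis constants:

```typescript
function calculateDecay(opts: {
  lastAccessedAt: number;
  createdAt: number;
  accessCount: number;
  now: number;
}): number {
  const DAY = 86_400_000;
  const daysSinceAccess = (opts.now - opts.lastAccessedAt) / DAY;
  const daysSinceCreation = (opts.now - opts.createdAt) / DAY;

  const accessDecay = 1 - Math.exp(-daysSinceAccess / 14); // exponential in staleness
  const ageDecay = Math.min(1, daysSinceCreation / 365);   // linear in age, capped at 1
  const frequencyRelief = 1 / (1 + opts.accessCount);      // inverse in access count

  const score = 0.6 * accessDecay + 0.2 * ageDecay + 0.2 * frequencyRelief;
  return Math.min(1, Math.max(0, score)); // clamp to 0-1
}
```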
Decay Thresholds
decayScore <= 0.25 → Hot tier candidate
decayScore >= 0.60 → Archive candidate (soft)
decayScore >= 0.80 → Archive candidate (hard)
Decay Adjustment
Access resets decay:
Before access: decayScore = 0.8 (very stale)
After access: decayScore = 0.1 (fresh)
Relationships slow decay:
Isolated memory: decayScore increases quickly
Connected memory: decayScore increases slowly
Priority Calculation
Priority represents memory importance.
Priority Factors
priority = weighted_average([
userExplicitPriority * 0.4, // Manual priority setting
accessFrequency * 0.2, // How often accessed
relationshipDensity * 0.2, // Knowledge graph centrality
recency * 0.1, // How recent
sourceImportance * 0.1 // Source type weight
]);
// Range: 0.0 (low) to 1.0 (high)
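A minimal sketch, assuming each factor has already been normalized to the 0-1 range (since the weights sum to 1, the weighted average reduces to a dot product):

```typescript
function calculatePriority(f: {
  userExplicitPriority: number;
  accessFrequency: number;
  relationshipDensity: number;
  recency: number;
  sourceImportance: number;
}): number {
  // Weighted average with weights summing to 1
  return (
    f.userExplicitPriority * 0.4 +
    f.accessFrequency * 0.2 +
    f.relationshipDensity * 0.2 +
    f.recency * 0.1 +
    f.sourceImportance * 0.1
  );
}
```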
Priority Updates
Priority recalculates when:
- Memory is accessed
- Relationships change
- User sets manual priority
- Sleep agent runs tier step
Priority Thresholds
priority >= 0.65 → Hot tier candidate
priority >= 0.30 → Warm tier candidate
priority < 0.30 → Archive candidate
Troubleshooting
Issue: Memories Not Being Tagged
Symptoms:
- autoTags array is empty
- Tags not appearing in UI
Diagnosis:
- Check lastTaggedAt timestamp
- Verify sleep agent is running
- Check sleep run logs for tagging step
Solutions:
- Wait for next sleep cycle (every 60 minutes)
- Manually trigger sleep run
- Check content exists (empty content = no tags)
- Verify OpenAI API key is configured
Issue: All Memories Stuck in Hot Tier
Symptoms:
- No memories moving to warm/archive
- Tier distribution heavily skewed
Diagnosis:
- Check lastAccessedAt timestamps
- Review access patterns (frequent access?)
- Verify sleep agent tier step is running
Solutions:
- Recent access (< 3 days) keeps memories hot
- Wait for access window to expire
- Check for automated processes accessing memories
- Verify tier change criteria are appropriate
Issue: Relationships Not Detected
Symptoms:
- Knowledge graph is sparse
- No edges between memories
Diagnosis:
- Check if memories share tags
- Verify lastRelationshipAuditAt timestamp
- Review sleep run logs for linking step
Solutions:
- Add more tags to memories
- Wait for sleep cycle to run linking
- Ensure memories have overlapping entities
- Check confidence threshold configuration
Issue: Poor Search Results
Symptoms:
- Relevant memories not returned
- Low similarity scores
Diagnosis:
- Check if memories are chunked
- Verify embeddings exist in Chroma
- Test both vector and metadata search separately
Solutions:
- Re-ingest memories to regenerate embeddings
- Check embedding model version consistency
- Try different query phrasing
- Use tag filters to narrow results
- Verify Chroma collection is accessible
Issue: Summaries Missing
Symptoms:
- No preview text in search results
- Summary fields are null
Diagnosis:
- Check the memorySummaries table
- Verify sleep agent summarize step runs
- Check OpenAI API quotas
Solutions:
- Trigger sleep cycle manually
- Verify API key has credits
- Check for errors in sleep run logs
- Re-summarize specific memories via API
Next Steps
- Retrieval Guide - Master hybrid search techniques
- UI Guide - Navigate the topology interface
- Core Concepts - Understand fundamental concepts