Core Concepts

This guide explains the fundamental concepts behind the Kybernesis memory platform.

What is a Memory?
Memory Items vs Chunks
Entities and Relationships
Memory Tiers
Tags: Auto vs Manual
The Sleep Agent
Hybrid Retrieval
Memory Sources

What is a Memory?

A memory is any piece of information you store in Kybernesis. It could be:

A document you uploaded
A message from a chat conversation
A file synced from Google Drive or Notion
Any text or content you want to preserve and search

Each memory has:

Title - A descriptive name for the memory
Content - The actual text or information
Tags - Keywords that categorize the memory
Source - Where the memory came from (upload, chat, or connector)
Priority - How important the system considers this memory (0 to 1)
Tier - Storage tier (hot, warm, or archive)
Timestamps - When it was created and last accessed

Think of memories as intelligent notes that understand their own importance and relationships to other information.
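
The fields listed above map naturally onto a small record type. The following is a minimal sketch in Python, not the actual Kybernesis schema: the class name `Memory`, the attribute names, and the default values are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List

# Allowed values taken from the field descriptions above.
VALID_SOURCES = {"upload", "chat", "connector"}
VALID_TIERS = {"hot", "warm", "archive"}


def _now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class Memory:
    """Hypothetical in-memory shape of a memory (names are assumptions)."""
    title: str
    content: str
    source: str                                   # "upload", "chat", or "connector"
    tags: List[str] = field(default_factory=list)
    priority: float = 0.5                         # 0 to 1; this default is an assumption
    tier: str = "hot"                             # this default is an assumption
    created_at: datetime = field(default_factory=_now)
    last_accessed_at: datetime = field(default_factory=_now)

    def __post_init__(self) -> None:
        # Reject values outside the ranges described in the text.
        if self.source not in VALID_SOURCES:
            raise ValueError(f"unknown source: {self.source!r}")
        if self.tier not in VALID_TIERS:
            raise ValueError(f"unknown tier: {self.tier!r}")
        if not 0.0 <= self.priority <= 1.0:
            raise ValueError("priority must be between 0 and 1")


note = Memory(title="Q4 budget notes", content="Draft figures...",
              source="upload", tags=["budget"])
```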

Memory Items vs Chunks

Kybernesis splits large memories into smaller pieces, called chunks, for efficient processing and retrieval.

Memory Item

The memory item is the complete, original piece of content. It contains:

All metadata (title, tags, source)
Overall priority and tier assignment
Links to all its chunks

Memory Chunks

Chunks are smaller segments of the original content. Typically, each chunk:

Holds up to 1200 characters
Overlaps the previous chunk by 120 characters for continuity
Is split at a natural boundary (paragraph or sentence)
Has its own vector embedding for semantic search
Inherits the parent memory's tier

Why chunking matters:

Large documents become searchable at a granular level
You can find specific sections without reading entire documents
Vector search works better on focused, coherent segments
Different chunks can be retrieved based on relevance

Example:

Memory Item: "Product Roadmap 2025.pdf" (15 pages)
  └── Chunk 1: "Q1 Objectives..." (first section)
  └── Chunk 2: "Engineering priorities..." (middle section)
  └── Chunk 3: "Budget considerations..." (final section)

When you search for "engineering priorities," only Chunk 2 might be returned, giving you precisely the relevant section.
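
The chunking rules described in this section can be pictured with the sketch below. It is an illustration under stated assumptions, not the Kybernesis implementation: the function name `chunk_text` and the boundary preference (paragraph break, then sentence end, then space) are hypothetical, while the 1200 and 120 character limits come from the description of chunks.

```python
from typing import List

MAX_CHARS = 1200   # chunk size limit from the text
OVERLAP = 120      # characters shared between neighboring chunks


def chunk_text(text: str, max_chars: int = MAX_CHARS, overlap: int = OVERLAP) -> List[str]:
    """Split text into overlapping chunks, preferring natural boundaries."""
    if overlap >= max_chars:
        raise ValueError("overlap must be smaller than max_chars")
    chunks: List[str] = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        if end < len(text):
            window = text[start:end]
            # Prefer a paragraph break, then a sentence end, then a space.
            for sep in ("\n\n", ". ", " "):
                cut = window.rfind(sep)
                # Accept a boundary only if the chunk still extends past the overlap,
                # so the next start position always moves forward.
                if cut > overlap:
                    end = start + cut + len(sep)
                    break
        chunks.append(text[start:end])
        if end == len(text):
            break
        start = end - overlap  # step back so neighbors share `overlap` characters
    return chunks


sample = "Sentence number one. " * 200
pieces = chunk_text(sample)
```

Each chunk after the first begins with the last 120 characters of the chunk before it, which is what keeps a sentence that straddles a boundary retrievable from either side.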

Entities and Relationships

Kybernesis builds a knowledge graph from your memories by extracting entities and detecting relationships between them.

Entities

Entities are important concepts, people, places, or things mentioned in your memories:

Names (people, organizations, products)
Locations (cities, countries, addresses)
Concepts (topics, themes, categories)
Technical terms (APIs, frameworks, algorithms)

Each entity has:

Name - The entity identifier
Type - Category (person, organization, location, concept, etc.)
Salience - How prominent it is across your memories (0 to 1)

Relationships (Edges)

Relationships connect entities based on how they appear together in your memories:

Relation type - Describes the connection ("works_with", "located_in", "part_of")
Weight - Strength of the relationship (0 to 1)
Context - Which memory chunk contains this relationship
Confidence - How certain the system is about the connection

Example knowledge graph:

Entities:
- "Alice" (person, salience: 0.8)
- "Product Team" (organization, salience: 0.7)
- "San Francisco" (location, salience: 0.5)

Relationships:
- Alice --[works_with]--> Product Team (weight: 0.9)
- Product Team --[located_in]--> San Francisco (weight: 0.7)

The knowledge graph helps you discover:

How concepts relate to each other
Hidden connections between memories
Clusters of related information
Important recurring themes
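
To make the graph structure concrete, the example graph from this section can be written as plain data. This is a sketch only: the `Entity` and `Edge` classes, their attribute names, and the `neighbors` helper are hypothetical and do not describe how Kybernesis stores its graph.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Entity:
    name: str
    type: str        # e.g. "person", "organization", "location"
    salience: float  # 0 to 1


@dataclass(frozen=True)
class Edge:
    source: str
    relation: str    # e.g. "works_with", "located_in", "part_of"
    target: str
    weight: float    # 0 to 1


entities = [
    Entity("Alice", "person", 0.8),
    Entity("Product Team", "organization", 0.7),
    Entity("San Francisco", "location", 0.5),
]
edges = [
    Edge("Alice", "works_with", "Product Team", 0.9),
    Edge("Product Team", "located_in", "San Francisco", 0.7),
]


def neighbors(graph_edges: List[Edge], name: str, min_weight: float = 0.0) -> List[str]:
    """Return entities directly connected to `name`, in either direction."""
    found: List[str] = []
    for edge in graph_edges:
        if edge.weight < min_weight:
            continue  # ignore relationships weaker than the requested weight
        if edge.source == name:
            found.append(edge.target)
        elif edge.target == name:
            found.append(edge.source)
    return found


team_links = neighbors(edges, "Product Team")
```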

Memory Tiers

Kybernesis automatically manages memory storage across three tiers based on usage patterns, similar to how your brain prioritizes information:

Hot Tier

Fast retrieval, actively used memories

A memory stays in the hot tier if it meets ANY of these conditions:

Priority ≥ 0.65 (high importance)
Decay score ≤ 0.25 (low decay)
Accessed within the last 3 days
Has 6+ relationships in the knowledge graph
Has 4+ recent active connections
Manually pinned by you

Use case: Current projects, frequently referenced documents, recent conversations

Warm Tier

Moderate retrieval, occasionally accessed memories

A memory that does not qualify for the hot tier is placed in warm if it meets ANY of these conditions:

Priority ≥ 0.3 (moderate importance)
Accessed within the last 21 days
Has manual tags you've added
Has 3+ relationships in the knowledge graph

Use case: Archived projects, reference materials, seasonal information

Archive Tier

Slow retrieval, rarely accessed memories

A memory moves to archive if ALL of these are true:

Not accessed for 30+ days
Low priority (< 0.3)
High decay score (at or above a threshold between 0.6 and 0.8)
Low connectivity (≤ 2 relationships, no recent edges)
No manual tags

Use case: Old documents, completed projects, historical data

Tier Transitions

Memories move between tiers automatically during sleep cycles (every 60 minutes):

Access a memory → may promote to hot
Memory gains connections → may promote to warm/hot
Memory unused for weeks → may demote to archive

You can also manually pin memories to keep them in the hot tier regardless of usage.
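
The tier rules can be read as a single decision function. The sketch below is one possible reading, with several assumptions that the text does not confirm: the names `TierSignals` and `classify_tier` are hypothetical, the rules are evaluated in the order hot, warm, archive, the archive decay threshold is fixed at 0.6 (the low end of the stated range), and a memory that matches no rule keeps its current tier.

```python
from dataclasses import dataclass


@dataclass
class TierSignals:
    """Hypothetical inputs to the tier decision (names are assumptions)."""
    priority: float            # 0 to 1
    decay: float               # 0 to 1
    days_since_access: float
    relationships: int
    recent_connections: int = 0
    has_manual_tags: bool = False
    pinned: bool = False


ARCHIVE_DECAY = 0.6  # assumption: the text gives a 0.6 to 0.8 range


def classify_tier(s: TierSignals, current_tier: str = "warm") -> str:
    # Hot: ANY condition is enough.
    if (s.pinned or s.priority >= 0.65 or s.decay <= 0.25
            or s.days_since_access <= 3 or s.relationships >= 6
            or s.recent_connections >= 4):
        return "hot"
    # Warm: ANY condition is enough.
    if (s.priority >= 0.3 or s.days_since_access <= 21
            or s.has_manual_tags or s.relationships >= 3):
        return "warm"
    # Archive: ALL conditions must hold.
    if (s.days_since_access >= 30 and s.priority < 0.3
            and s.decay >= ARCHIVE_DECAY and s.relationships <= 2
            and s.recent_connections == 0 and not s.has_manual_tags):
        return "archive"
    # Assumption: when no rule matches, leave the memory where it is.
    return current_tier


stale = TierSignals(priority=0.1, decay=0.9, days_since_access=45, relationships=1)
tagged = TierSignals(priority=0.1, decay=0.9, days_since_access=45, relationships=1,
                     has_manual_tags=True)
```

Under this reading, `stale` is archived while `tagged` stays in warm, which matches the statement that manual tags prevent archiving.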

Tags: Auto vs Manual

Tags categorize and organize your memories. Kybernesis supports two types:

Auto Tags

System-generated tags created automatically during sleep cycles:

Generated from:

Source type (upload, chat, connector)
File extensions (.pdf, .md, .doc)
Provider names (google-drive, notion)
Content keywords (extracted from summaries)

Characteristics:

Regenerated when a memory has not been re-tagged in the last 7 days
Maximum 6 auto tags per memory
Can be regenerated without losing manual tags
Confidence-scored by the tagging system

Example auto tags:

Memory: "Q4_Budget_Analysis.pdf"
Auto tags: ["upload", "pdf", "budget", "analysis", "financial", "quarterly"]

Manual Tags

User-created tags that you add yourself:

Use for:

Project names
Custom categories
Personal organizational schemes
Team-specific classifications

Characteristics:

Never removed by the system
Displayed with emerald-colored badges in the UI
Kept separate from auto tags in the database
Prevent a memory from being archived

Example manual tags:

Manual tags: ["2025-planning", "high-priority", "review-needed"]

Combined Tags

The system maintains a combined tags array that merges auto tags and manual tags. This combined list is used for:

Search and filtering
Relationship detection
Memory clustering in the topology graph

You can filter search results by any tag (auto or manual) to find related memories quickly.
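
As an illustration of how a combined list could be built, the sketch below merges the two tag sets. It is not the Kybernesis implementation: the function name `combine_tags`, the lowercase normalization, and the choice to list manual tags first are assumptions; only the limit of 6 auto tags comes from the text.

```python
from typing import Iterable, List

MAX_AUTO_TAGS = 6  # limit stated for auto tags


def combine_tags(auto_tags: Iterable[str], manual_tags: Iterable[str]) -> List[str]:
    """Merge manual and auto tags, dropping duplicates and keeping order."""
    combined: List[str] = []
    seen = set()
    # Manual tags go first so they are never displaced (an assumption).
    for tag in list(manual_tags) + list(auto_tags)[:MAX_AUTO_TAGS]:
        key = tag.strip().lower()  # normalization is an assumption
        if key and key not in seen:
            seen.add(key)
            combined.append(key)
    return combined


merged = combine_tags(["upload", "pdf", "budget"], ["2025-planning", "Budget"])
```

Here `merged` holds `2025-planning`, `budget`, `upload`, and `pdf`, with the duplicate `budget` collapsed into one entry.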

The Sleep Agent

The Sleep Agent is Kybernesis's background processing system that maintains and enriches your memories automatically.

What It Does

Every 60 minutes, the Sleep Agent runs through four processing steps:

1. Tag (Auto-Tagging)

Analyzes memory content
Extracts semantic keywords
Generates up to 6 auto tags per memory
Only re-tags memories that were last tagged more than 7 days ago

2. Link (Relationship Detection)

Finds memories with shared tags/entities
Proposes relationship connections
Creates edges in the knowledge graph
Assigns confidence scores to relationships

3. Tier (Storage Management)

Evaluates access patterns
Calculates decay scores
Moves memories between hot/warm/archive tiers
Updates chunk storage layers

4. Summarize (Content Condensation)

Creates condensed representations
Generates quick preview text
Improves retrieval performance
Enables faster memory browsing

Sleep Runs

Each sleep cycle is tracked as a sleep run with:

Start/completion timestamps
Number of tasks completed/failed
Performance metrics per step
Error logs if issues occur

You can view sleep run history to understand how your memory collection is being maintained.

When Sleep Runs

Automatic: Every 60 minutes via scheduler
Manual: You can trigger a sleep cycle on demand
Checkpointed: Each step is saved, so interrupted runs can resume
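
Checkpointing can be sketched as a loop that records each finished step and skips recorded steps on the next attempt. This is a simplified model, not the Sleep Agent's code: the function `run_sleep_cycle`, the handler dictionary, and the in-memory `completed` set (standing in for whatever persistent store the platform uses) are all hypothetical.

```python
from typing import Callable, Dict, List, Set

# The four steps, in the order described for a sleep cycle.
STEPS = ["tag", "link", "tier", "summarize"]


def run_sleep_cycle(handlers: Dict[str, Callable[[], None]], completed: Set[str]) -> List[str]:
    """Run the steps in order, skipping any already recorded in `completed`."""
    ran: List[str] = []
    for step in STEPS:
        if step in completed:
            continue              # finished in an earlier, interrupted run
        handlers[step]()          # may raise; the checkpoint below is then skipped
        completed.add(step)       # checkpoint: record the step as done
        ran.append(step)
    return ran


# Simulate a run that is interrupted during the "tier" step, then resumed.
log: List[str] = []
state = {"fail_once": True}


def tier_step() -> None:
    if state["fail_once"]:
        state["fail_once"] = False
        raise RuntimeError("interrupted")
    log.append("tier")


handlers = {
    "tag": lambda: log.append("tag"),
    "link": lambda: log.append("link"),
    "tier": tier_step,
    "summarize": lambda: log.append("summarize"),
}
done: Set[str] = set()
try:
    run_sleep_cycle(handlers, done)
except RuntimeError:
    pass                          # first attempt stops after "tag" and "link"
resumed = run_sleep_cycle(handlers, done)
```

The resumed run executes only `tier` and `summarize`; the two steps that completed before the interruption are not repeated.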

Benefits of the Sleep Agent

Zero manual maintenance - Automatic organization
Improved discovery - Better tags and relationships
Optimized performance - Automatic tier management
Always up-to-date - Regular re-analysis of memories

The Sleep Agent works while you're away, ensuring your memory collection stays organized, connected, and optimized.

Hybrid Retrieval

Hybrid retrieval combines two search strategies to find the most relevant memories:

Vector Search (Semantic Similarity)

Finds memories based on meaning, not just keywords:

Converts your query into a numeric vector (embedding)
Compares against all memory chunk embeddings
Returns chunks with similar semantic meaning
Works even if exact words don't match

Example:

Query: "How do I increase revenue?"
Matches: Memories about "sales growth", "profit optimization", "monetization strategies"
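
The comparison step can be illustrated with cosine similarity, a common measure for embedding vectors. The text does not say which similarity measure Kybernesis uses, so treat this as an assumption; the two-dimensional vectors are toy values rather than real embeddings, and the function names are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest_chunks(query: List[float], chunk_vectors: Dict[str, List[float]],
                   k: int) -> List[Tuple[str, float]]:
    """Score every chunk against the query and return the k most similar."""
    scored = [(chunk_id, cosine_similarity(query, vec))
              for chunk_id, vec in chunk_vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors: the first axis loosely stands for "revenue-related" meaning.
toy_chunks = {
    "sales growth": [0.9, 0.1],
    "office floor plan": [0.0, 1.0],
}
top = nearest_chunks([1.0, 0.0], toy_chunks, k=1)
```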

Metadata Filtering (Structured Search)

Finds memories based on attributes:

Title matching
Tag filtering
Source type
Priority/tier
Date ranges

Example:

Filter: tags=["2025-planning"] AND source="upload"
Returns: All uploaded documents tagged with "2025-planning"
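
The example filter can be expressed as a small predicate over memory records. This is a sketch under assumptions: the dictionary keys and the function `filter_memories` are hypothetical, and treating multiple requested tags as "all must match" is one plausible reading rather than documented behavior.

```python
from typing import Dict, Iterable, List, Optional


def filter_memories(memories: Iterable[Dict], tags: Optional[List[str]] = None,
                    source: Optional[str] = None) -> List[Dict]:
    """Keep memories that carry every requested tag and match the source."""
    wanted = set(tags or [])
    return [
        m for m in memories
        if wanted.issubset(m["tags"]) and (source is None or m["source"] == source)
    ]


records = [
    {"title": "Roadmap", "tags": ["2025-planning", "pdf"], "source": "upload"},
    {"title": "Standup notes", "tags": ["2025-planning"], "source": "chat"},
    {"title": "Old invoice", "tags": ["pdf"], "source": "upload"},
]
planning_uploads = filter_memories(records, tags=["2025-planning"], source="upload")
```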

How They Combine

The hybrid search system:

Runs both searches in parallel (vector + metadata)
Normalizes scores to a 0-1 range
Combines scores with a weighted formula:
- Vector score × 0.7 (70% weight)
- Metadata score × 0.3 (30% weight)
Ranks results by combined hybrid score
Returns top matches up to your specified limit
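
The combination steps can be sketched as follows. The 0.7 and 0.3 weights come from the text; everything else is an assumption for illustration, including the function names, the input shape, and the use of min-max scaling as the normalization method.

```python
from typing import Dict, List

VECTOR_WEIGHT = 0.7    # from the text
METADATA_WEIGHT = 0.3  # from the text


def min_max_normalize(scores: List[float]) -> List[float]:
    """Scale scores to the 0-1 range (min-max scaling is an assumption)."""
    if not scores:
        return []
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [1.0 for _ in scores]  # all candidates tie
    return [(s - lo) / (hi - lo) for s in scores]


def hybrid_rank(candidates: List[Dict], limit: int) -> List[Dict]:
    """Normalize both score sets, blend them, and return the top `limit`."""
    vec = min_max_normalize([c["vector"] for c in candidates])
    meta = min_max_normalize([c["metadata"] for c in candidates])
    results = [
        {
            "id": c["id"],
            "vectorScore": v,
            "metadataScore": m,
            "hybridScore": VECTOR_WEIGHT * v + METADATA_WEIGHT * m,
        }
        for c, v, m in zip(candidates, vec, meta)
    ]
    results.sort(key=lambda r: r["hybridScore"], reverse=True)
    return results[:limit]


raw = [
    {"id": "a", "vector": 0.9, "metadata": 0.2},
    {"id": "b", "vector": 0.5, "metadata": 0.8},
    {"id": "c", "vector": 0.1, "metadata": 0.5},
]
ranked = hybrid_rank(raw, limit=2)
```

With these inputs, candidate `a` ranks first on the strength of its vector score alone, even though `b` has the best metadata match. That is the practical effect of the 70/30 weighting.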

Understanding Scores

Each result includes three scores:

hybridScore - Overall relevance (0 to 1)
vectorScore - Semantic similarity (0 to 1)
metadataScore - Metadata match quality (0 to 1)

Higher scores indicate more relevant results.

Hybrid retrieval gives you the best of both worlds: the intelligence of semantic search with the precision of filtered queries.

Memory Sources

Memories can originate from three different sources:

1. Upload

Direct file uploads from your computer:

Supported formats:

Text documents (.txt, .md, .doc, .docx)
PDFs (.pdf)
Code files (.js, .py, .java, etc.)
Other text-based formats

Features:

Drag-and-drop interface
Fast processing for all file sizes
Preserves filename and metadata
Automatic content extraction

2. Chat

Messages from conversations with the system:

Characteristics:

Each message becomes a memory
Preserves conversational context
Timestamped for chronological tracking
Includes user ID and session metadata

Use case: Capture insights, decisions, and notes from interactive sessions

3. Connector

Synced content from external platforms via OAuth:

Supported connectors:

Google Drive - Documents, sheets, presentations
Notion - Pages, databases, workspaces

How it works:

Authenticate with OAuth (one-time setup)
Connector syncs files automatically
Updates tracked with cursor-based pagination
Metadata includes connector type and source reference

Sync frequency:

Triggered by scheduler (every 60 minutes)
Manual sync available on demand
Incremental updates (only new/changed items)
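
Incremental sync with a cursor can be pictured as the loop below. It is a generic sketch, not the Google Drive or Notion connector: `sync_incremental`, `fake_fetch`, the integer cursor, and the in-memory change list are all hypothetical stand-ins for a real provider API.

```python
from typing import Callable, Dict, List, Tuple

# A hypothetical change feed; a real connector would call the provider's API.
CHANGES: List[dict] = [
    {"id": "doc-1", "title": "Roadmap"},
    {"id": "doc-2", "title": "Budget"},
    {"id": "doc-1", "title": "Roadmap v2"},
]


def fake_fetch(cursor: int, page_size: int = 2) -> Tuple[List[dict], int]:
    """Return one page of changes after `cursor`, plus the advanced cursor."""
    page = CHANGES[cursor:cursor + page_size]
    return page, cursor + len(page)


def sync_incremental(fetch: Callable[[int], Tuple[List[dict], int]],
                     cursor: int, store: Dict[str, dict]) -> int:
    """Pull pages until none remain, upserting items; return the new cursor."""
    while True:
        items, cursor = fetch(cursor)
        if not items:
            return cursor             # save this value for the next sync
        for item in items:
            store[item["id"]] = item  # new items are added, changed ones replaced


store: Dict[str, dict] = {}
saved_cursor = sync_incremental(fake_fetch, 0, store)

# A later sync starts from the saved cursor and sees only what changed since.
CHANGES.append({"id": "doc-3", "title": "Hiring plan"})
next_cursor = sync_incremental(fake_fetch, saved_cursor, store)
```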

Benefits:

Automatically imports your existing knowledge
Keeps memories in sync with source platforms
Centralized search across all your information
No manual copying required

Each memory source is tagged automatically, so you can filter by source type when searching (e.g., "show me all uploaded PDFs" or "find connector-synced documents").

Summary

Kybernesis organizes your information into memories that are:

Chunked for granular search and retrieval
Connected through an entity knowledge graph
Tiered automatically based on usage patterns
Tagged with both auto-generated and manual keywords
Maintained by the Sleep Agent background processor
Searchable via hybrid vector + metadata retrieval
Sourced from uploads, chats, and OAuth connectors

Understanding these concepts helps you get the most out of Kybernesis as your unified memory platform.

Next Steps

Memory System Deep Dive - Learn how memory storage and processing works
Retrieval Guide - Master hybrid search and query syntax
UI Guide - Navigate the topology interface
Connectors Setup - Connect Google Drive and Notion
