Core Concepts

This guide explains the fundamental concepts behind the Kybernesis memory platform.


What is a Memory?

A memory is any piece of information you store in Kybernesis. It could be:

  • A document you uploaded
  • A message from a chat conversation
  • A file synced from Google Drive or Notion
  • Any text or content you want to preserve and search

Each memory has:

  • Title - A descriptive name for the memory
  • Content - The actual text or information
  • Tags - Keywords that categorize the memory
  • Source - Where the memory came from (upload, chat, or connector)
  • Priority - How important the system considers this memory (0 to 1)
  • Tier - Storage tier (hot, warm, or archive)
  • Timestamps - When it was created and last accessed

Think of memories as intelligent notes that understand their own importance and relationships to other information.
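The fields above map naturally onto a small record type. The following is a minimal sketch, not the Kybernesis schema: the class name `Memory`, the attribute names, and the validation rules are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List

# Allowed values taken from the field descriptions above.
SOURCES = {"upload", "chat", "connector"}
TIERS = {"hot", "warm", "archive"}


def _now() -> datetime:
    """Current time as a timezone-aware UTC timestamp."""
    return datetime.now(timezone.utc)


@dataclass
class Memory:
    # Attribute names are hypothetical; the real schema may differ.
    title: str
    content: str
    source: str
    priority: float
    tier: str
    tags: List[str] = field(default_factory=list)
    created_at: datetime = field(default_factory=_now)
    last_accessed_at: datetime = field(default_factory=_now)

    def __post_init__(self) -> None:
        # Reject values outside the ranges the guide describes.
        if self.source not in SOURCES:
            raise ValueError(f"unknown source: {self.source!r}")
        if self.tier not in TIERS:
            raise ValueError(f"unknown tier: {self.tier!r}")
        if not 0.0 <= self.priority <= 1.0:
            raise ValueError("priority must be between 0 and 1")


note = Memory(
    title="Q4 Budget Analysis",
    content="Notes on the quarterly budget.",
    source="upload",
    priority=0.7,
    tier="hot",
    tags=["budget", "quarterly"],
)
```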


Memory Items vs Chunks

Kybernesis splits large memories into smaller pieces for efficient processing and retrieval:

Memory Item

The memory item is the complete, original piece of content. It contains:

  • All metadata (title, tags, source)
  • Overall priority and tier assignment
  • Links to all its chunks

Memory Chunks

Chunks are smaller segments of the original content. Typically, each chunk:

  • Is up to 1200 characters long
  • Overlaps the previous chunk by 120 characters for continuity
  • Is split at a natural boundary (paragraph or sentence)
  • Has its own vector embedding for semantic search
  • Inherits the parent memory's tier

Why chunking matters:

  • Large documents become searchable at a granular level
  • You can find specific sections without reading entire documents
  • Vector search works better on focused, coherent segments
  • Different chunks can be retrieved based on relevance

Example:

Memory Item: "Product Roadmap 2025.pdf" (15 pages)
  └── Chunk 1: "Q1 Objectives..." (first section)
  └── Chunk 2: "Engineering priorities..." (middle section)
  └── Chunk 3: "Budget considerations..." (final section)

When you search for "engineering priorities," only Chunk 2 might be returned, giving you precisely the relevant section.
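To make the chunking parameters concrete, here is a rough sketch of a splitter that uses the 1200-character limit and 120-character overlap described above. The function name `chunk_text` and the boundary-selection strategy (paragraph break, then sentence end, then space) are assumptions; the actual Kybernesis splitter may work differently.

```python
from typing import List


def chunk_text(text: str, max_chars: int = 1200, overlap: int = 120) -> List[str]:
    """Split text into overlapping chunks, preferring natural boundaries."""
    if not 0 <= overlap < max_chars:
        raise ValueError("overlap must be smaller than max_chars")

    chunks: List[str] = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        if end < len(text):
            window = text[start:end]
            # Try the most natural boundary first, then fall back.
            for sep in ("\n\n", ". ", " "):
                cut = window.rfind(sep)
                # Require the cut to fall past the overlap so each
                # chunk always advances through the text.
                if cut > overlap:
                    end = start + cut + len(sep)
                    break
        chunks.append(text[start:end])
        if end == len(text):
            break
        # Step back so the next chunk repeats the last `overlap` characters.
        start = end - overlap
    return chunks


sample = "word " * 1000  # 5000 characters of filler text
pieces = chunk_text(sample)
```

Each element of `pieces` is at most 1200 characters long, and every chunk after the first begins with the final 120 characters of the one before it.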


Entities and Relationships

Kybernesis builds a knowledge graph from your memories by extracting entities and detecting relationships between them.

Entities

Entities are important concepts, people, places, or things mentioned in your memories:

  • Names (people, organizations, products)
  • Locations (cities, countries, addresses)
  • Concepts (topics, themes, categories)
  • Technical terms (APIs, frameworks, algorithms)

Each entity has:

  • Name - The entity identifier
  • Type - Category (person, organization, location, concept, etc.)
  • Salience - How prominent it is across your memories (0 to 1)

Relationships (Edges)

Relationships connect entities based on how they appear together in your memories:

  • Relation type - Describes the connection ("works_with", "located_in", "part_of")
  • Weight - Strength of the relationship (0 to 1)
  • Context - Which memory chunk contains this relationship
  • Confidence - How certain the system is about the connection

Example knowledge graph:

Entities:
- "Alice" (person, salience: 0.8)
- "Product Team" (organization, salience: 0.7)
- "San Francisco" (location, salience: 0.5)

Relationships:
- Alice --[works_with]--> Product Team (weight: 0.9)
- Product Team --[located_in]--> San Francisco (weight: 0.7)

The knowledge graph helps you discover:

  • How concepts relate to each other
  • Hidden connections between memories
  • Clusters of related information
  • Important recurring themes
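The example graph above can be represented with two small record types and a lookup helper. This is an illustrative sketch only: the `Entity` and `Edge` classes and the `neighbors` function are hypothetical names, not part of a documented Kybernesis API.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Entity:
    name: str
    type: str
    salience: float  # 0 to 1


@dataclass(frozen=True)
class Edge:
    source: str
    relation: str
    target: str
    weight: float  # 0 to 1


entities = [
    Entity("Alice", "person", 0.8),
    Entity("Product Team", "organization", 0.7),
    Entity("San Francisco", "location", 0.5),
]

edges = [
    Edge("Alice", "works_with", "Product Team", 0.9),
    Edge("Product Team", "located_in", "San Francisco", 0.7),
]


def neighbors(name: str, graph: List[Edge]) -> List[Tuple[str, str, float]]:
    """Return (relation, other entity, weight) for each edge touching `name`,
    strongest relationship first."""
    found = []
    for edge in graph:
        if edge.source == name:
            found.append((edge.relation, edge.target, edge.weight))
        elif edge.target == name:
            found.append((edge.relation, edge.source, edge.weight))
    return sorted(found, key=lambda item: item[2], reverse=True)


team_links = neighbors("Product Team", edges)
```

Here `team_links` lists both Alice and San Francisco, which is how an indirect connection between the two would surface.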

Memory Tiers

Kybernesis automatically manages memory storage across three tiers based on usage patterns, similar to how your brain prioritizes information:

Hot Tier

Fast retrieval, actively used memories

A memory stays in the hot tier if it meets ANY of these conditions:

  • Priority ≥ 0.65 (high importance)
  • Decay score ≤ 0.25 (low decay)
  • Accessed within the last 3 days
  • Has 6+ relationships in the knowledge graph
  • Has 4+ recent active connections
  • Manually pinned by you

Use case: Current projects, frequently referenced documents, recent conversations

Warm Tier

Moderate retrieval, occasionally accessed memories

A memory that does not qualify for the hot tier moves to warm if it meets ANY of these conditions:

  • Priority ≥ 0.3 (moderate importance)
  • Accessed within the last 21 days
  • Has manual tags you've added
  • Has 3+ relationships in the knowledge graph

Use case: Archived projects, reference materials, seasonal information

Archive Tier

Slow retrieval, rarely accessed memories

A memory moves to archive if ALL of these are true:

  • Not accessed for 30+ days
  • Low priority (< 0.3)
  • High decay score (at or above a threshold that falls between 0.6 and 0.8)
  • Low connectivity (≤ 2 relationships, no recent edges)
  • No manual tags

Use case: Old documents, completed projects, historical data

Tier Transitions

Memories move between tiers automatically during sleep cycles (every 60 minutes):

  • Access a memory → may promote to hot
  • Memory gains connections → may promote to warm/hot
  • Memory unused for weeks → may demote to archive

You can also manually pin memories to keep them in the hot tier regardless of usage.
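The tier rules can be read as a decision function: check the hot conditions first, then warm, then archive. The sketch below encodes the thresholds listed above. The `MemoryStats` type, its field names, the choice of 0.6 as the archive decay threshold, and the fallback to warm for memories that match no rule are all assumptions, since this guide does not specify them.

```python
from dataclasses import dataclass

# The guide gives a 0.6 to 0.8 range; the lower bound is assumed here.
ARCHIVE_DECAY_THRESHOLD = 0.6


@dataclass
class MemoryStats:
    # Hypothetical field names for the signals the tier rules use.
    priority: float
    decay_score: float
    days_since_access: float
    relationship_count: int
    recent_connection_count: int
    pinned: bool = False
    has_manual_tags: bool = False


def classify_tier(m: MemoryStats) -> str:
    # Hot: ANY condition is enough.
    if (
        m.pinned
        or m.priority >= 0.65
        or m.decay_score <= 0.25
        or m.days_since_access <= 3
        or m.relationship_count >= 6
        or m.recent_connection_count >= 4
    ):
        return "hot"

    # Warm: ANY condition is enough.
    if (
        m.priority >= 0.3
        or m.days_since_access <= 21
        or m.has_manual_tags
        or m.relationship_count >= 3
    ):
        return "warm"

    # Archive: ALL conditions must hold.
    if (
        m.days_since_access >= 30
        and m.priority < 0.3
        and m.decay_score >= ARCHIVE_DECAY_THRESHOLD
        and m.relationship_count <= 2
        and m.recent_connection_count == 0
        and not m.has_manual_tags
    ):
        return "archive"

    # Assumption: a memory that matches no rule stays in warm.
    return "warm"


stale = MemoryStats(priority=0.1, decay_score=0.9, days_since_access=45,
                    relationship_count=1, recent_connection_count=0)
tagged = MemoryStats(priority=0.1, decay_score=0.9, days_since_access=45,
                     relationship_count=1, recent_connection_count=0,
                     has_manual_tags=True)
```

With these inputs, `classify_tier(stale)` returns `"archive"`, while `classify_tier(tagged)` returns `"warm"` because a manual tag alone satisfies a warm condition.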


Tags: Auto vs Manual

Tags categorize and organize your memories. Kybernesis supports two types:

Auto Tags

System-generated tags created automatically during sleep cycles:

Generated from:

  • Source type (upload, chat, connector)
  • File extensions (.pdf, .md, .doc)
  • Provider names (google-drive, notion)
  • Content keywords (extracted from summaries)

Characteristics:

  • Regenerated once a memory has gone 7 days without being re-tagged
  • Maximum 6 auto tags per memory
  • Can be regenerated without losing manual tags
  • Confidence-scored by the tagging system

Example auto tags:

Memory: "Q4_Budget_Analysis.pdf"
Auto tags: ["upload", "pdf", "budget", "analysis", "financial", "quarterly"]

Manual Tags

User-created tags that you add yourself:

Use for:

  • Project names
  • Custom categories
  • Personal organizational schemes
  • Team-specific classifications

Characteristics:

  • Never removed by the system
  • Displayed with emerald-colored badges in the UI
  • Kept separate from auto tags in the database
  • Prevent a memory from being archived

Example manual tags:

Manual tags: ["2025-planning", "high-priority", "review-needed"]

Combined Tags

The system maintains a combined tags array that merges auto tags and manual tags. This combined list is used for:

  • Search and filtering
  • Relationship detection
  • Memory clustering in the topology graph

You can filter search results by any tag (auto or manual) to find related memories quickly.
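One plausible way to build the combined array is to concatenate the two lists and drop duplicates. This is a sketch under assumptions: the function name `combine_tags`, the choice to list manual tags first, and exact-match de-duplication are illustrative and not confirmed by this guide. Only the 6-tag cap on auto tags comes from the text above.

```python
from typing import Iterable, List

MAX_AUTO_TAGS = 6  # per-memory cap on auto tags stated above


def combine_tags(auto_tags: Iterable[str], manual_tags: Iterable[str]) -> List[str]:
    """Merge manual and auto tags into one de-duplicated list.

    Manual tags come first (an assumption), and the two input lists are
    left untouched so they can still be stored separately.
    """
    capped_auto = list(auto_tags)[:MAX_AUTO_TAGS]
    combined: List[str] = []
    seen = set()
    for tag in list(manual_tags) + capped_auto:
        if tag not in seen:
            seen.add(tag)
            combined.append(tag)
    return combined


auto = ["upload", "pdf", "budget", "analysis", "financial", "quarterly"]
manual = ["2025-planning", "budget"]
all_tags = combine_tags(auto, manual)
```

Because the inputs are not modified, auto tags can be regenerated and merged again without touching the manual list.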


The Sleep Agent

The Sleep Agent is Kybernesis's background processing system that maintains and enriches your memories automatically.

What It Does

Every 60 minutes, the Sleep Agent runs through four processing steps:

1. Tag (Auto-Tagging)

  • Analyzes memory content
  • Extracts semantic keywords
  • Generates up to 6 auto tags per memory
  • Only re-tags memories that have gone 7 days without being re-tagged

2. Link (Relationship Detection)

  • Finds memories with shared tags/entities
  • Proposes relationship connections
  • Creates edges in the knowledge graph
  • Assigns confidence scores to relationships

3. Tier (Storage Management)

  • Evaluates access patterns
  • Calculates decay scores
  • Moves memories between hot/warm/archive tiers
  • Updates chunk storage layers

4. Summarize (Content Condensation)

  • Creates condensed representations
  • Generates quick preview text
  • Improves retrieval performance
  • Enables faster memory browsing

Sleep Runs

Each sleep cycle is tracked as a sleep run with:

  • Start/completion timestamps
  • Number of tasks completed/failed
  • Performance metrics per step
  • Error logs if issues occur

You can view sleep run history to understand how your memory collection is being maintained.

When Sleep Runs

  • Automatic: Every 60 minutes via scheduler
  • Manual: You can trigger a sleep cycle on demand
  • Checkpointed: Each step is saved, so interrupted runs can resume
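The checkpointing behavior can be pictured as a loop that records each finished step and skips recorded steps on the next attempt. The sketch below is hypothetical: `run_sleep_cycle`, the `handlers` mapping, and the in-memory `checkpoint` list are stand-ins, and a real system would persist the checkpoint to durable storage.

```python
from typing import Callable, Dict, List

# Step order taken from the four processing steps described above.
STEPS = ["tag", "link", "tier", "summarize"]


def run_sleep_cycle(
    handlers: Dict[str, Callable[[], None]],
    checkpoint: List[str],
) -> List[str]:
    """Run the steps in order, skipping any already in `checkpoint`.

    Returns the steps executed during this call. If a handler raises,
    the steps finished before it remain recorded in `checkpoint`.
    """
    executed: List[str] = []
    for step in STEPS:
        if step in checkpoint:
            continue  # finished in an earlier, interrupted run
        handlers[step]()
        checkpoint.append(step)  # a real system would persist this
        executed.append(step)
    return executed


# Simulate a run that is interrupted during the "tier" step.
log: List[str] = []
state = {"fail_once": True}


def tier_step() -> None:
    if state["fail_once"]:
        state["fail_once"] = False
        raise RuntimeError("interrupted")
    log.append("tier")


handlers = {
    "tag": lambda: log.append("tag"),
    "link": lambda: log.append("link"),
    "tier": tier_step,
    "summarize": lambda: log.append("summarize"),
}

checkpoint: List[str] = []
try:
    run_sleep_cycle(handlers, checkpoint)
except RuntimeError:
    pass  # the checkpoint now holds the two finished steps

resumed = run_sleep_cycle(handlers, checkpoint)
```

After the interruption the checkpoint holds `tag` and `link`, so the second call runs only `tier` and `summarize`.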

Benefits of the Sleep Agent

  • Zero manual maintenance - Automatic organization
  • Improved discovery - Better tags and relationships
  • Optimized performance - Automatic tier management
  • Always up-to-date - Regular re-analysis of memories

The Sleep Agent works while you're away, ensuring your memory collection stays organized, connected, and optimized.


Hybrid Retrieval

Hybrid retrieval combines two search strategies to find the most relevant memories:

Vector Search (Semantic Similarity)

Finds memories based on meaning, not just keywords:

  • Converts your query into a numeric vector (embedding)
  • Compares against all memory chunk embeddings
  • Returns chunks with similar semantic meaning
  • Works even if exact words don't match

Example:

Query: "How do I increase revenue?"
Matches: Memories about "sales growth", "profit optimization", "monetization strategies"
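The comparison step can be sketched with cosine similarity, one common way to compare embeddings. This guide does not state which similarity measure Kybernesis uses, so treat that choice as an assumption. The three-dimensional vectors are hand-made toy values standing in for real model embeddings, and `search_chunks` is a hypothetical helper.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_chunks(
    query_vec: Sequence[float],
    chunk_vecs: Dict[str, List[float]],
    limit: int = 3,
) -> List[Tuple[str, float]]:
    """Score every chunk against the query and return the best matches."""
    scored = [
        (chunk_id, cosine_similarity(query_vec, vec))
        for chunk_id, vec in chunk_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


# Toy embeddings: the first axis loosely stands for "revenue" topics.
chunk_vecs = {
    "sales growth": [0.9, 0.1, 0.0],
    "monetization strategies": [0.8, 0.3, 0.1],
    "office seating plan": [0.0, 0.2, 0.9],
}
query_vec = [1.0, 0.0, 0.0]  # stand-in for "How do I increase revenue?"

top = search_chunks(query_vec, chunk_vecs, limit=2)
```

The two revenue-related chunks rank above the unrelated one even though none of them contains the word "revenue".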

Metadata Filtering (Structured Search)

Finds memories based on attributes:

  • Title matching
  • Tag filtering
  • Source type
  • Priority/tier
  • Date ranges

Example:

Filter: tags=["2025-planning"] AND source="upload"
Returns: All uploaded documents tagged with "2025-planning"
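The filter in the example above could be expressed as a predicate over memory records. This is a sketch, not the Kybernesis query API: `filter_memories` and the dictionary keys are hypothetical, and requiring every listed tag to be present (rather than any one of them) is an assumption.

```python
from typing import Dict, List, Optional


def filter_memories(
    memories: List[Dict],
    tags: Optional[List[str]] = None,
    source: Optional[str] = None,
) -> List[Dict]:
    """Keep memories that carry every requested tag AND match the source."""
    required = set(tags or [])
    results = []
    for memory in memories:
        if not required.issubset(memory["tags"]):
            continue
        if source is not None and memory["source"] != source:
            continue
        results.append(memory)
    return results


memories = [
    {"title": "Roadmap.pdf", "tags": ["2025-planning", "pdf"], "source": "upload"},
    {"title": "Planning chat", "tags": ["2025-planning"], "source": "chat"},
    {"title": "Invoice.pdf", "tags": ["pdf"], "source": "upload"},
]

matches = filter_memories(memories, tags=["2025-planning"], source="upload")
```

Only "Roadmap.pdf" satisfies both conditions, mirroring the AND in the filter expression.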

How They Combine

The hybrid search system:

  1. Runs both searches in parallel (vector + metadata)
  2. Normalizes scores to a 0-1 range
  3. Combines scores with a weighted formula:
    • Vector score × 0.7 (70% weight)
    • Metadata score × 0.3 (30% weight)
  4. Ranks results by combined hybrid score
  5. Returns top matches up to your specified limit

Understanding Scores

Each result includes three scores:

  • hybridScore - Overall relevance (0 to 1)
  • vectorScore - Semantic similarity (0 to 1)
  • metadataScore - Metadata match quality (0 to 1)

Higher scores indicate more relevant results.

Hybrid retrieval gives you the best of both worlds: the intelligence of semantic search with the precision of filtered queries.
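As a worked illustration of the weighting, the sketch below applies the 0.7 and 0.3 weights to scores that are already normalized to the 0-1 range and ranks the results. The `hybrid_rank` function and the candidate dictionaries are hypothetical; only the weights and the score field names come from this guide.

```python
from typing import Dict, List

VECTOR_WEIGHT = 0.7
METADATA_WEIGHT = 0.3


def hybrid_rank(candidates: List[Dict], limit: int = 10) -> List[Dict]:
    """Attach a hybridScore to each candidate and return the top matches.

    Assumes vectorScore and metadataScore are already normalized to 0-1.
    """
    results = []
    for candidate in candidates:
        hybrid = (
            VECTOR_WEIGHT * candidate["vectorScore"]
            + METADATA_WEIGHT * candidate["metadataScore"]
        )
        results.append({**candidate, "hybridScore": hybrid})
    results.sort(key=lambda r: r["hybridScore"], reverse=True)
    return results[:limit]


candidates = [
    {"id": "a", "vectorScore": 0.9, "metadataScore": 0.2},  # 0.63 + 0.06
    {"id": "b", "vectorScore": 0.5, "metadataScore": 1.0},  # 0.35 + 0.30
    {"id": "c", "vectorScore": 0.2, "metadataScore": 0.1},  # 0.14 + 0.03
]

ranked = hybrid_rank(candidates, limit=2)
```

Candidate `a` outranks `b` despite its weak metadata match, because the vector score carries more than twice the weight.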


Memory Sources

Memories can originate from three different sources:

1. Upload

Direct file uploads from your computer:

Supported formats:

  • Text documents (.txt, .md, .doc, .docx)
  • PDFs (.pdf)
  • Code files (.js, .py, .java, etc.)
  • Other text-based formats

Features:

  • Drag-and-drop interface
  • Fast processing for all file sizes
  • Preserves filename and metadata
  • Automatic content extraction

2. Chat

Messages from conversations with the system:

Characteristics:

  • Each message becomes a memory
  • Preserves conversational context
  • Timestamped for chronological tracking
  • Includes user ID and session metadata

Use case: Capture insights, decisions, and notes from interactive sessions

3. Connector

Synced content from external platforms via OAuth:

Supported connectors:

  • Google Drive - Documents, sheets, presentations
  • Notion - Pages, databases, workspaces

How it works:

  1. Authenticate with OAuth (one-time setup)
  2. Connector syncs files automatically
  3. Updates tracked with cursor-based pagination
  4. Metadata includes connector type and source reference

Sync frequency:

  • Triggered by scheduler (every 60 minutes)
  • Manual sync available on demand
  • Incremental updates (only new/changed items)
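Cursor-based incremental sync generally follows a simple loop: request a page starting from the saved cursor, collect the items, and store the new cursor for the next run. The sketch below shows that pattern with a fake in-memory provider. The `sync_connector` function, the `fetch_page` callback, and the page fields (`items`, `next_cursor`, `has_more`) are all hypothetical and do not describe the Google Drive or Notion APIs.

```python
from typing import Callable, Dict, List, Optional, Tuple

Page = Dict[str, object]


def sync_connector(
    fetch_page: Callable[[Optional[str]], Page],
    cursor: Optional[str] = None,
) -> Tuple[List[str], Optional[str]]:
    """Pull pages until the provider reports no more changes.

    Returns the new or changed items and the cursor to save for the
    next sync, so that run only sees later changes.
    """
    items: List[str] = []
    while True:
        page = fetch_page(cursor)
        items.extend(page["items"])
        cursor = page["next_cursor"]
        if not page["has_more"]:
            break
    return items, cursor


# Fake provider: maps a cursor to the page of changes that follows it.
FAKE_PAGES: Dict[Optional[str], Page] = {
    None: {"items": ["doc-a", "doc-b"], "next_cursor": "c1", "has_more": True},
    "c1": {"items": ["doc-c"], "next_cursor": "c2", "has_more": False},
    "c2": {"items": [], "next_cursor": "c2", "has_more": False},
}


def fake_fetch(cursor: Optional[str]) -> Page:
    return FAKE_PAGES[cursor]


first_items, saved_cursor = sync_connector(fake_fetch)
second_items, _ = sync_connector(fake_fetch, saved_cursor)
```

The first sync returns all three documents. The second, started from the saved cursor, returns nothing because no items changed in between.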

Benefits:

  • Automatically imports your existing knowledge
  • Keeps memories in sync with source platforms
  • Centralized search across all your information
  • No manual copying required

Each memory source is tagged automatically, so you can filter by source type when searching (e.g., "show me all uploaded PDFs" or "find connector-synced documents").


Summary

Kybernesis organizes your information into memories that are:

  • Chunked for granular search and retrieval
  • Connected through an entity knowledge graph
  • Tiered automatically based on usage patterns
  • Tagged with both auto-generated and manual keywords
  • Maintained by the Sleep Agent background processor
  • Searchable via hybrid vector + metadata retrieval
  • Sourced from uploads, chats, and OAuth connectors

Understanding these concepts helps you get the most out of Kybernesis as your unified memory platform.


Next Steps