Retrieval Guide

Learn how to search and retrieve memories using Kybernesis's hybrid search system.

How Hybrid Search Works

Kybernesis uses hybrid retrieval that combines three search methods plus cognitive enrichment:

  1. Vector Search (55% weight) - Semantic similarity using AI embeddings
  2. Metadata Filtering (25% weight) - Structured search on attributes
  3. Keyword Overlap (20% weight) - Lexical term matching

The Hybrid Process

terminal
User Query: "What does Ian work on?"
     │
     ├─→ Vector Search (55%)
     │     ├─ Convert query to embedding
     │     ├─ Compare to all memory chunks
     │     └─ Find semantic matches
     │
     ├─→ Metadata Search (25%)
     │     ├─ Search titles
     │     ├─ Filter by tags
     │     └─ Match attributes
     │
     ├─→ Keyword Search (20%)
     │     └─ Literal term overlap in content
     │
     └─→ Combine, Boost & Enrich
           ├─ Normalize and weight scores
           ├─ Apply fact-aware boosts (confidence-weighted)
           ├─ Apply graph neighbor boost (+0.08)
           ├─ Apply surprisal boost (+0.03 for novel facts)
           ├─ Fetch entity profiles (with narratives)
           ├─ Fetch reasoning insights (deductions/inductions)
           ├─ Fetch pending contradictions
           └─ Return enriched results
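
The combine-and-boost step can be sketched in Python. This is a minimal illustration, not the Kybernesis implementation: the weights and boost values come from the diagram above, while the function name `combine_scores`, the assumption that each component score is already normalized to the 0.0 to 1.0 range, the additive application of boosts, and the final clamp to 1.0 are all assumptions made for the example. The confidence-weighted fact boost is left out because its formula is not given here.

```python
# Weights and boost values taken from the diagram above.
VECTOR_WEIGHT = 0.55
METADATA_WEIGHT = 0.25
KEYWORD_WEIGHT = 0.20
GRAPH_NEIGHBOR_BOOST = 0.08
SURPRISAL_BOOST = 0.03


def combine_scores(vector, metadata, keyword,
                   is_graph_neighbor=False, is_novel_fact=False):
    """Blend three normalized scores, then add boosts.

    Assumption: boosts are additive and the result is clamped to 1.0.
    """
    score = (
        VECTOR_WEIGHT * vector
        + METADATA_WEIGHT * metadata
        + KEYWORD_WEIGHT * keyword
    )
    if is_graph_neighbor:
        score += GRAPH_NEIGHBOR_BOOST
    if is_novel_fact:
        score += SURPRISAL_BOOST
    return min(score, 1.0)


base = combine_scores(0.9, 0.6, 0.5)
boosted = combine_scores(0.9, 0.6, 0.5, is_graph_neighbor=True)
print(round(base, 3), round(boosted, 3))
```

Because the three weights sum to 1.0, the blended score stays in the 0.0 to 1.0 range before boosts are applied.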

Cognitive Context in Results

Beyond ranked memories, each retrieval response includes:

  • Entity Profiles: Structured facts and narrative summaries for detected entities
  • Insights: Reasoning-derived deductions and inductions (confidence >= 0.7)
  • Pending Contradictions: Conflicting facts that need human resolution
  • Reasoning Traces: Per-result explanations of why each memory ranked where it did

Why Hybrid?

Vector search alone:

  • Great for semantic understanding
  • Finds related concepts
  • Can miss exact keywords
  • May return overly broad results

Metadata search alone:

  • Precise keyword matching
  • Fast structured filtering
  • Misses semantic relationships
  • Requires exact terms

Hybrid search:

  • Combines the strengths of each method
  • Balances precision and recall
  • More robust to query variations
  • Better overall relevance

Vector Search

Vector search finds memories based on semantic meaning, not just keyword matching.

How It Works

1. Query Embedding

Your query is converted to a 1536-dimensional vector:

terminal
Query: "How to optimize database performance"
        ↓
Embedding: [0.23, -0.15, 0.78, ..., 0.42]

2. Similarity Calculation

The system compares your query vector against all memory chunk vectors:

terminal
Distance = Euclidean distance between vectors
Similarity = 1 / (1 + distance)

Lower distance = Higher similarity = Better match
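
As a rough illustration of the calculation above, the following Python sketch converts a Euclidean distance into a similarity score. The helper names and the three-dimensional toy vectors are assumptions for readability; real embeddings have 1536 dimensions.

```python
import math


def similarity_from_distance(distance):
    """Map a non-negative distance to a similarity in (0, 1]."""
    return 1.0 / (1.0 + distance)


def euclidean_similarity(query_vec, chunk_vec):
    """Similarity between two equal-length vectors via Euclidean distance."""
    return similarity_from_distance(math.dist(query_vec, chunk_vec))


# Toy 3-dimensional vectors (illustrative only).
query = [0.23, -0.15, 0.78]
chunk_close = [0.25, -0.10, 0.80]
chunk_far = [-0.60, 0.40, 0.10]

print(euclidean_similarity(query, chunk_close))
print(euclidean_similarity(query, chunk_far))
```

An identical vector has distance 0 and therefore similarity 1.0. The score approaches 0 as the distance grows.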

3. Ranking

Chunks are ranked by similarity score:

terminal
Chunk A: similarity = 0.92 (excellent match)
Chunk B: similarity = 0.78 (good match)
Chunk C: similarity = 0.45 (weak match)

Semantic Matching Examples

Vector search understands related concepts:

terminal
Query: "increase revenue"
Matches:
  - "boost sales figures"
  - "improve profit margins"
  - "grow quarterly earnings"
  - "enhance monetization"

terminal
Query: "project planning"
Matches:
  - "roadmap development"
  - "milestone scheduling"
  - "sprint organization"
  - "delivery timelines"

Vector Search Parameters

Configurable via API:

terminal
{
  query: "your search query",
  limit: 10,              // Number of results
  orgId: "org_123"        // Organization filter
}

Behind the scenes:

  • Collection: "kybernesis_memories"
  • Distance metric: Euclidean
  • Embedding model: text-embedding-3-small
  • Dimensions: 1536

Performance Characteristics

  • Speed: ~50-200ms for queries
  • Capacity: Millions of vectors
  • Accuracy: High for semantic queries
  • Cache: 30-second TTL for repeated queries

Metadata Filtering

Metadata search finds memories based on structured attributes.

Searchable Fields

Title Matching

Search memory titles for keywords:

terminal
Query: "budget"
Matches titles containing:
  - "Q4 Budget Analysis"
  - "2025 Budget Planning"
  - "Engineering Budget Review"

Tag Filtering

Filter by exact tag matches:

terminal
Tags: ["2025-planning", "high-priority"]
Returns only memories with BOTH tags

Source Type

Filter by memory origin:

terminal
Source: "upload"
Returns only uploaded files

Source: "connector"
Returns only synced content

Priority Range

Filter by importance score:

terminal
Priority >= 0.7
Returns high-priority memories only

Tier Filtering

Filter by storage tier:

terminal
Tier: "hot"
Returns only hot-tier memories

Date Ranges

Filter by creation or access time:

terminal
CreatedAt >= "2024-01-01"
CreatedAt <= "2024-12-31"
Returns memories from 2024

Metadata Score Calculation

terminal
// Weighted sum of four components (weights sum to 1.0)
metadataScore =
  titleMatch * 0.4 +       // Title keyword presence
  tagMatch * 0.3 +         // Tag overlap
  priorityMatch * 0.2 +    // Priority alignment
  recencyBoost * 0.1;      // Recent creation/access

// Range: 0.0 to 1.0
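
A minimal Python sketch of this weighted sum follows. The weights are the ones listed above. The function name and the assumption that each component is already a value between 0.0 and 1.0 are illustrative, since the guide does not say how the individual components are computed.

```python
def metadata_score(title_match, tag_match, priority_match, recency_boost):
    """Weighted sum of four components, each assumed to lie in [0.0, 1.0]."""
    return (
        0.4 * title_match
        + 0.3 * tag_match
        + 0.2 * priority_match
        + 0.1 * recency_boost
    )


# Example: full title match, half the tags, high priority, fairly old memory.
score = metadata_score(title_match=1.0, tag_match=0.5,
                       priority_match=0.8, recency_boost=0.2)
print(round(score, 2))  # 0.73
```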

Boolean Logic

Metadata filters combine with AND logic:

terminal
Tags: ["ai", "python"]
+ Source: "upload"
+ Priority >= 0.5

→ Memories matching ALL conditions
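
The AND semantics can be illustrated with a small Python predicate. This is a client-side sketch under assumptions, not the server's filtering code: the function name `matches_filters` and its parameters are hypothetical, and the memory fields (`tags`, `source`, `priority`) follow the response format shown in the API examples.

```python
def matches_filters(memory, tags=(), source=None, min_priority=None):
    """Return True only if the memory satisfies every supplied filter."""
    # Every requested tag must be present on the memory.
    if not set(tags).issubset(memory.get("tags", [])):
        return False
    if source is not None and memory.get("source") != source:
        return False
    if min_priority is not None and memory.get("priority", 0.0) < min_priority:
        return False
    return True


memories = [
    {"id": "mem_1", "tags": ["ai", "python", "tensorflow"],
     "source": "upload", "priority": 0.8},
    {"id": "mem_2", "tags": ["ai", "python"],
     "source": "connector", "priority": 0.9},
    {"id": "mem_3", "tags": ["ai", "python"],
     "source": "upload", "priority": 0.3},
]

hits = [m["id"] for m in memories
        if matches_filters(m, tags=["ai", "python"],
                           source="upload", min_priority=0.5)]
print(hits)  # ['mem_1']
```

Here `mem_2` fails the source filter and `mem_3` fails the priority filter, so only `mem_1` satisfies all three conditions.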

Query Syntax

Basic Queries

Simple text search:

terminal
how to deploy kubernetes

Returns memories semantically related to Kubernetes deployment.

Multi-word phrases:

terminal
"machine learning deployment"

Embeds the phrase as a whole and searches for it in vector space, so matches are semantic rather than strictly literal.

Questions:

terminal
What are the benefits of microservices?

Natural language questions work well with semantic search.

Query Best Practices

✅ Do

Use natural language:

terminal
How can I improve API response times?

Include context:

terminal
React performance optimization techniques for large lists

Be specific:

terminal
PostgreSQL query optimization for JOIN operations

Use domain terms:

terminal
OAuth2 authorization code flow implementation

❌ Avoid

Single keywords:

terminal
kubernetes

Too broad; use phrases instead.

Boolean operators:

terminal
docker AND kubernetes OR containers

Not supported; use natural phrases.

Wildcards:

terminal
deploy*

Not needed; semantic search handles variations.

Query Length Guidelines

Length        Effectiveness   Use Case
1-2 words     Poor            Too vague
3-5 words     Good            Specific topics
6-15 words    Excellent       Detailed questions
16+ words     Moderate        May be overly specific

Optimal:

terminal
best practices for RESTful API design

Filtering by Tags

Tags provide powerful filtering capabilities.

Tag Filter Syntax

API request:

terminal
POST /retrieval/hybrid
{
  "query": "deployment strategies",
  "tags": ["kubernetes", "production"],
  "limit": 10
}

cURL example:

terminal
curl -X POST https://api.kybernesis.com/retrieval/hybrid \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "query": "deployment strategies",
    "tags": ["kubernetes", "production"],
    "orgId": "org_123",
    "limit": 10
  }'

Tag Matching Behavior

All tags required (AND logic):

terminal
tags: ["ai", "python", "tensorflow"]

Matches:
  Memory A: tags = ["ai", "python", "tensorflow", "keras"]  ✓
  Memory B: tags = ["ai", "python"]                         ✗
  Memory C: tags = ["ai", "machine-learning"]               ✗

Common Tag Patterns

By Project

terminal
tags: ["2025-planning", "product-roadmap"]

By Technology

terminal
tags: ["react", "typescript", "frontend"]

By Status

terminal
tags: ["high-priority", "in-progress"]

By Team

terminal
tags: ["engineering", "backend", "infrastructure"]

Combining Tags with Search

Narrow semantic search with tags:

terminal
{
  query: "database optimization techniques",
  tags: ["postgresql", "production"],
  limit: 10
}

This returns:

  • Memories semantically about database optimization
  • AND tagged with both "postgresql" AND "production"

Tag Discovery

Finding available tags:

terminal
GET /api/tags?orgId=org_123

Returns a list of all tags used in your organization.


Understanding Scores

Each search result includes multiple scores to help you understand relevance.

Score Types

Hybrid Score

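Overall relevance combining the vector and metadata scores (the keyword-overlap component described under How Hybrid Search Works does not appear in this formula):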

terminal
hybridScore = (vectorScore * 0.7) + (metadataScore * 0.3)

Range: 0.0 to 1.0
Higher = More relevant

Interpretation:

  • 0.8 - 1.0: Excellent match (exact or highly relevant)
  • 0.6 - 0.8: Good match (relevant content)
  • 0.4 - 0.6: Moderate match (somewhat related)
  • 0.2 - 0.4: Weak match (tangentially related)
  • 0.0 - 0.2: Poor match (barely relevant)
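
The formula and the interpretation bands above can be checked with a short Python sketch. The function names are illustrative, and because the listed bands share their endpoints, treating each lower bound as inclusive is an assumption.

```python
def hybrid_score(vector_score, metadata_score):
    """Two-component blend from the formula above."""
    return 0.7 * vector_score + 0.3 * metadata_score


def interpret(score):
    """Map a hybrid score to its band (lower bounds assumed inclusive)."""
    if score >= 0.8:
        return "excellent"
    if score >= 0.6:
        return "good"
    if score >= 0.4:
        return "moderate"
    if score >= 0.2:
        return "weak"
    return "poor"


score = hybrid_score(vector_score=0.92, metadata_score=0.75)
print(round(score, 2), interpret(score))  # 0.87 excellent
```

The inputs 0.92 and 0.75 are the vector and metadata scores used in the example result in this guide, which reports a hybrid score of 0.87.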

Vector Score

Semantic similarity only:

terminal
vectorScore = 1 / (1 + euclideanDistance)

Range: 0.0 to 1.0

Interpretation:

  • 0.9 - 1.0: Semantically identical
  • 0.7 - 0.9: Semantically very similar
  • 0.5 - 0.7: Semantically related
  • < 0.5: Semantically distant

Metadata Score

Structured attribute matching:

terminal
metadataScore = weighted combination of:
  - Title match
  - Tag overlap
  - Priority alignment
  - Recency

Range: 0.0 to 1.0

Interpretation:

  • 1.0: Perfect metadata match (all filters satisfied)
  • 0.5 - 1.0: Partial match (some filters satisfied)
  • < 0.5: Weak metadata match

Score Normalization

Scores are normalized to ensure fair comparison:

terminal
// Vector scores normalized by maximum
maxVector = 0.95
score1 = 0.85 → normalized = 0.85 / 0.95 = 0.89

// Metadata scores normalized by maximum
maxMetadata = 0.80
score2 = 0.60 → normalized = 0.60 / 0.80 = 0.75
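
A minimal Python sketch of max-normalization, using the numbers from the example above. The function name and the handling of an empty or all-zero score list are assumptions.

```python
def normalize_by_max(scores):
    """Divide each score by the largest score in the list."""
    top = max(scores, default=0.0)
    if top <= 0.0:
        # Assumption: with no positive scores, everything normalizes to 0.
        return [0.0 for _ in scores]
    return [s / top for s in scores]


vector_scores = [0.95, 0.85, 0.40]
metadata_scores = [0.80, 0.60]

print([round(s, 2) for s in normalize_by_max(vector_scores)])    # [1.0, 0.89, 0.42]
print([round(s, 2) for s in normalize_by_max(metadata_scores)])  # [1.0, 0.75]
```

After normalization the best result in each list scores 1.0, which puts vector and metadata scores on the same scale before they are weighted.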

Example Result

terminal
{
  "memoryId": "mem_abc123",
  "hybridScore": 0.87,      // Overall: excellent match
  "vectorScore": 0.92,      // Semantic: very similar
  "metadataScore": 0.75,    // Metadata: good match
  "memory": {
    "title": "Kubernetes Deployment Guide",
    "tags": ["kubernetes", "devops", "production"]
  },
  "chunks": [
    {
      "chunkId": "chunk_def456",
      "similarity": 0.92,
      "document": "To deploy to Kubernetes..."
    }
  ]
}

Performance Tips

Optimize Query Speed

1. Limit Results Appropriately

terminal
// Too many results = slower
{ limit: 50 }  // ❌ Slow

// Optimal for most use cases
{ limit: 10 }  // ✓ Fast

2. Use Tag Filters

terminal
// Broad search = slower
{ query: "deployment" }  // ❌ Searches everything

// Filtered search = faster
{ query: "deployment", tags: ["kubernetes"] }  // ✓ Focused

3. Cache Repeated Queries

The system automatically caches query results for 30 seconds:

terminal
Query 1: "deployment strategies" → 150ms
Query 2: "deployment strategies" → 5ms (cached)

4. Avoid Overly Generic Queries

terminal
// Too generic = many results to rank
{ query: "data" }  // ❌ Slow

// Specific = fewer results to rank
{ query: "database migration strategies" }  // ✓ Fast

Improve Result Quality

1. Be Specific

terminal
❌ "kubernetes"
✓ "kubernetes deployment best practices for production"

2. Use Multi-Word Queries

terminal
❌ "api"
✓ "RESTful API design patterns"

3. Include Context

terminal
❌ "optimize performance"
✓ "optimize React component rendering performance"

4. Combine with Tags

terminal
{
  query: "deployment strategies",
  tags: ["kubernetes", "production"]  // Narrows results
}

Batch Queries

For multiple searches, send requests in parallel:

terminal
// Sequential (slow)
const result1 = await search("query1");
const result2 = await search("query2");
const result3 = await search("query3");

// Parallel (fast)
const [result1, result2, result3] = await Promise.all([
  search("query1"),
  search("query2"),
  search("query3")
]);
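
For Python callers using a blocking HTTP client, the same parallel pattern can be sketched with a thread pool from the standard library. The `search` function here is a stand-in stub so the example runs on its own. In practice it would wrap a real request, such as the `search_memories` function in the Python example in this guide.

```python
from concurrent.futures import ThreadPoolExecutor


def search(query):
    """Stand-in for a real search call (stub so the sketch is runnable)."""
    return {"query": query, "results": []}


def search_many(queries, max_workers=4):
    """Run several searches concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(search, queries))


responses = search_many(["query1", "query2", "query3"])
print([r["query"] for r in responses])  # ['query1', 'query2', 'query3']
```

Threads suit this case because each search spends most of its time waiting on the network.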

Performance Benchmarks

Operation                  Latency (p50)   Latency (p95)
Vector search              50ms            150ms
Metadata search            20ms            80ms
Hybrid search (uncached)   100ms           250ms
Hybrid search (cached)     5ms             10ms

API Examples

Basic Search

HTTP Request:

terminal
curl -X POST https://api.kybernesis.com/retrieval/hybrid \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "orgId": "org_123",
    "query": "how to optimize database queries",
    "limit": 10
  }'

Response:

terminal
{
  "status": "ok",
  "orgId": "org_123",
  "query": "how to optimize database queries",
  "limit": 10,
  "results": [
    {
      "memoryId": "mem_abc123",
      "hybridScore": 0.87,
      "vectorScore": 0.92,
      "metadataScore": 0.75,
      "memory": {
        "id": "mem_abc123",
        "title": "Database Optimization Guide",
        "summary": "Comprehensive guide to optimizing SQL queries...",
        "tags": ["database", "postgresql", "optimization"],
        "source": "upload",
        "priority": 0.8,
        "tier": "hot",
        "createdAtIso": "2024-10-15T10:30:00Z"
      },
      "chunks": [
        {
          "chunkId": "chunk_def456",
          "similarity": 0.92,
          "document": "To optimize database queries, consider using indexes..."
        }
      ]
    }
  ],
  "diagnostics": {
    "vector": [
      {
        "chunkId": "chunk_def456",
        "distance": 0.087,
        "similarity": 0.92
      }
    ],
    "metadata": [
      {
        "memoryId": "mem_abc123",
        "score": 0.75
      }
    ]
  }
}

Search with Tag Filters

terminal
curl -X POST https://api.kybernesis.com/retrieval/hybrid \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "orgId": "org_123",
    "query": "deployment strategies",
    "tags": ["kubernetes", "production"],
    "limit": 10
  }'

Search with Summaries

terminal
curl -X POST https://api.kybernesis.com/retrieval/hybrid \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "orgId": "org_123",
    "query": "API authentication methods",
    "includeSummaries": true,
    "limit": 5
  }'

TypeScript SDK Example

terminal
import { KybernesisClient } from '@kybernesis/sdk';

const client = new KybernesisClient({
  apiKey: process.env.KYBERNESIS_API_KEY,
  orgId: 'org_123'
});

// Basic search
const results = await client.search({
  query: "machine learning deployment",
  limit: 10
});

// Search with filters
const filtered = await client.search({
  query: "API design patterns",
  tags: ["rest", "graphql"],
  limit: 15
});

// Process results
for (const result of results.results) {
  console.log(`[${result.hybridScore.toFixed(2)}] ${result.memory.title}`);
  console.log(`  Tags: ${result.memory.tags.join(', ')}`);
  console.log(`  Summary: ${result.memory.summary}`);
}

JavaScript Fetch Example

terminal
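// Read the API key from the environment (Node.js) rather than hard-coding it
const API_KEY = process.env.KYBERNESIS_API_KEY;

async function searchMemories(query, tags = []) {
  const response = await fetch('https://api.kybernesis.com/retrieval/hybrid', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${API_KEY}`
    },
    body: JSON.stringify({
      orgId: 'org_123',
      query,
      tags,
      limit: 10,
      includeSummaries: true
    })
  });

  // fetch does not reject on HTTP error statuses, so check explicitly
  if (!response.ok) {
    throw new Error(`Search failed with status ${response.status}`);
  }

  const data = await response.json();
  return data.results;
}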

// Usage
const results = await searchMemories(
  "kubernetes deployment",
  ["production", "devops"]
);

Python Example

terminal
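import os

import requests

# Read the API key from the environment rather than hard-coding it
API_KEY = os.environ["KYBERNESIS_API_KEY"]

def search_memories(query, tags=None, limit=10):
    url = "https://api.kybernesis.com/retrieval/hybrid"
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_KEY}"
    }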
    payload = {
        "orgId": "org_123",
        "query": query,
        "tags": tags or [],
        "limit": limit,
        "includeSummaries": True
    }

    response = requests.post(url, json=payload, headers=headers)
    response.raise_for_status()
    return response.json()["results"]

# Usage
results = search_memories(
    query="database optimization techniques",
    tags=["postgresql", "performance"],
    limit=15
)

for result in results:
    print(f"[{result['hybridScore']:.2f}] {result['memory']['title']}")
    print(f"  Tags: {', '.join(result['memory']['tags'])}")

Advanced Techniques

Query Refinement

Start broad, then narrow:

terminal
Query 1: "deployment"
→ Too many results

Query 2: "kubernetes deployment"
→ Better, still broad

Query 3: "kubernetes deployment strategies for microservices"
→ Focused, high-quality results

Multi-Stage Retrieval

Retrieve, analyze, then refine:

terminal
// Stage 1: Broad search
const initial = await search({
  query: "API design",
  limit: 20
});

// Stage 2: Analyze tags from results
const commonTags = extractCommonTags(initial.results);
// → ["rest", "graphql", "authentication"]

// Stage 3: Refined search
const refined = await search({
  query: "API design authentication patterns",
  tags: commonTags.slice(0, 2),
  limit: 10
});
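
The `extractCommonTags` helper used above is not defined in this guide. One possible version, sketched in Python as `extract_common_tags`, counts how often each tag appears across the results and returns the most frequent ones. The name, the `top_n` parameter, and the counting approach are assumptions. The result shape (`memory.tags`) follows the response format shown in the API examples.

```python
from collections import Counter


def extract_common_tags(results, top_n=3):
    """Return the top_n most frequent tags across a list of results."""
    counts = Counter(
        tag
        for result in results
        for tag in result["memory"]["tags"]
    )
    return [tag for tag, _ in counts.most_common(top_n)]


initial_results = [
    {"memory": {"tags": ["rest", "graphql", "authentication"]}},
    {"memory": {"tags": ["rest", "graphql"]}},
    {"memory": {"tags": ["rest"]}},
]

print(extract_common_tags(initial_results))
# ['rest', 'graphql', 'authentication']
```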

Re-Ranking Results

Custom scoring for specific use cases:

terminal
const { results } = await search({ query: "..." });

// Re-rank by recency (copy first: sort() reorders the array in place)
const reranked = [...results].sort((a, b) =>
  new Date(b.memory.createdAtIso).getTime() -
  new Date(a.memory.createdAtIso).getTime()
);

// Re-rank by priority
const byPriority = [...results].sort((a, b) =>
  b.memory.priority - a.memory.priority
);

Combining Multiple Queries

Search for multiple concepts:

terminal
const [backend, frontend, devops] = await Promise.all([
  search({ query: "backend optimization", tags: ["api"] }),
  search({ query: "frontend performance", tags: ["react"] }),
  search({ query: "deployment automation", tags: ["kubernetes"] })
]);

// Merge and deduplicate
const combined = mergeResults([backend, frontend, devops]);
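
The `mergeResults` helper is likewise left undefined. A possible Python sketch, named `merge_results` here, keeps the highest-scoring entry for each `memoryId` and sorts the merged list by `hybridScore`. It assumes each input is a plain list of result objects. If the search call returns a response object, pass its `results` list instead.

```python
def merge_results(result_sets):
    """Merge result lists, keeping the best-scoring entry per memoryId."""
    best = {}
    for results in result_sets:
        for result in results:
            current = best.get(result["memoryId"])
            if current is None or result["hybridScore"] > current["hybridScore"]:
                best[result["memoryId"]] = result
    # Highest hybrid score first.
    return sorted(best.values(), key=lambda r: r["hybridScore"], reverse=True)


backend = [
    {"memoryId": "mem_a", "hybridScore": 0.82},
    {"memoryId": "mem_b", "hybridScore": 0.64},
]
frontend = [
    {"memoryId": "mem_b", "hybridScore": 0.71},
    {"memoryId": "mem_c", "hybridScore": 0.55},
]

combined = merge_results([backend, frontend])
print([(r["memoryId"], r["hybridScore"]) for r in combined])
# [('mem_a', 0.82), ('mem_b', 0.71), ('mem_c', 0.55)]
```

Scores from different queries are only loosely comparable, so treat the merged order as a rough ranking.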

Troubleshooting

Issue: No Results Returned

Possible causes:

  1. Query too specific
  2. No memories match filters
  3. Embeddings not generated

Solutions:

Broaden your query:

terminal
❌ "kubernetes helm chart v3 deployment to AWS EKS"
✓ "kubernetes deployment strategies"

Remove tag filters:

terminal
// Try without tags first
{ query: "deployment" }

// Then add tags back incrementally
{ query: "deployment", tags: ["kubernetes"] }

Check if memories exist:

terminal
curl "https://api.kybernesis.com/api/memories?limit=10"

Issue: Irrelevant Results

Possible causes:

  1. Query too vague
  2. Low-quality memory content
  3. Missing embeddings

Solutions:

Add more context to query:

terminal
❌ "performance"
✓ "database query performance optimization"

Use tag filters:

terminal
{
  query: "optimization",
  tags: ["database", "postgresql"]  // Focus results
}

Check result scores:

terminal
results.forEach(r => {
  if (r.hybridScore < 0.5) {
    console.warn("Low relevance:", r.memory.title);
  }
});

Issue: Slow Queries

Possible causes:

  1. Too many results requested
  2. First query (not cached)
  3. System load

Solutions:

Reduce limit:

terminal
{ query: "...", limit: 10 }  // Instead of 50

Use tag filters:

terminal
{ query: "...", tags: ["specific-tag"] }  // Narrows search space

Monitor latency:

terminal
const start = Date.now();
const results = await search({ query: "..." });
console.log(`Latency: ${Date.now() - start}ms`);

Issue: Duplicate Results

Possible causes:

  1. Same memory with multiple matching chunks
  2. Similar memories with slight variations

Solutions:

Deduplicate by memoryId:

terminal
// Keep the first (highest-ranked) result for each memoryId
const seen = new Set();
const unique = results.filter(r =>
  !seen.has(r.memoryId) && seen.add(r.memoryId)
);

Deduplicate by similarity:

terminal
const deduped = results.filter((result, index) => {
  return !results.slice(0, index).some(prev =>
    calculateSimilarity(result, prev) > 0.95
  );
});
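
`calculateSimilarity` is a helper you supply yourself. As one simple option, the Python sketch below uses Jaccard overlap of lowercase word tokens from each result's first chunk. This is a swapped-in lexical measure chosen for illustration, not the embedding similarity Kybernesis uses, and the function names are hypothetical. Unlike the filter above, it compares each result only against results already kept.

```python
def jaccard_similarity(text_a, text_b):
    """Share of distinct lowercase word tokens the two texts have in common."""
    tokens_a = set(text_a.lower().split())
    tokens_b = set(text_b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def dedupe_by_similarity(results, threshold=0.95):
    """Drop results whose first chunk nearly duplicates an earlier kept one."""
    kept = []
    for result in results:
        text = result["chunks"][0]["document"]
        if all(
            jaccard_similarity(text, k["chunks"][0]["document"]) <= threshold
            for k in kept
        ):
            kept.append(result)
    return kept


results = [
    {"memoryId": "mem_1", "chunks": [{"document": "To deploy to Kubernetes use kubectl apply"}]},
    {"memoryId": "mem_2", "chunks": [{"document": "to deploy to kubernetes use kubectl apply"}]},
    {"memoryId": "mem_3", "chunks": [{"document": "Index your tables to speed up queries"}]},
]

print([r["memoryId"] for r in dedupe_by_similarity(results)])
# ['mem_1', 'mem_3']
```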

Issue: Missing Expected Results

Possible causes:

  1. Memory in archive tier
  2. Memory not yet indexed
  3. Embedding version mismatch

Solutions:

Check memory status:

terminal
curl https://api.kybernesis.com/api/memories/{memoryId}

Verify tier:

terminal
// Check if memory is archived
if (memory.tier === "archive") {
  // May not appear in search
}

Re-index memory:

terminal
curl -X POST https://api.kybernesis.com/api/memories/{memoryId}/reindex

Next Steps

  • UI Guide - Learn to search using the topology interface
  • Core Concepts - Understand how retrieval fits into the system
  • Memory System - Dive deeper into embeddings and chunking