Retrieval Guide

Learn how to search and retrieve memories using Kybernesis's hybrid search system.

How Hybrid Search Works

Kybernesis uses hybrid retrieval that combines two search methods for optimal results:

  1. Vector Search - Semantic similarity using AI embeddings
  2. Metadata Filtering - Structured search on attributes

The Hybrid Process

terminal
User Query: "machine learning deployment strategies"
     │
     ├─→ Vector Search (70% weight)
     │     ├─ Convert query to embedding
     │     ├─ Compare to all memory chunks
     │     └─ Find semantic matches
     │
     ├─→ Metadata Search (30% weight)
     │     ├─ Search titles
     │     ├─ Filter by tags
     │     └─ Match attributes
     │
     └─→ Combine & Rank
           ├─ Normalize scores (0-1)
           ├─ Apply weights (70/30)
           ├─ Merge duplicates
           └─ Return top N results
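The "Combine & Rank" step in the diagram can be sketched roughly as follows. This is an illustration only, not Kybernesis's actual implementation: `combineAndRank`, the `{ memoryId, score }` hit shape, and the choice to clamp scores into 0-1 and keep each memory's best score are all assumptions. Only the 70/30 weights and the order of the steps come from the diagram.

```javascript
// Hypothetical sketch of "Combine & Rank"; names and hit shapes are assumptions.
const VECTOR_WEIGHT = 0.7;
const METADATA_WEIGHT = 0.3;

// Normalize by clamping into the 0-1 range (one simple option among several).
const clamp01 = x => Math.min(1, Math.max(0, x));

function combineAndRank(vectorHits, metadataHits, limit = 10) {
  const merged = new Map(); // memoryId -> { vectorScore, metadataScore }

  // Merge duplicates: a memory may appear several times; keep its best score.
  const add = (hits, field) => {
    for (const { memoryId, score } of hits) {
      const entry = merged.get(memoryId) ?? { vectorScore: 0, metadataScore: 0 };
      entry[field] = Math.max(entry[field], clamp01(score));
      merged.set(memoryId, entry);
    }
  };
  add(vectorHits, "vectorScore");
  add(metadataHits, "metadataScore");

  // Apply the weights, sort by hybrid score, and return the top N.
  return [...merged.entries()]
    .map(([memoryId, s]) => ({
      memoryId,
      ...s,
      hybridScore: VECTOR_WEIGHT * s.vectorScore + METADATA_WEIGHT * s.metadataScore,
    }))
    .sort((a, b) => b.hybridScore - a.hybridScore)
    .slice(0, limit);
}
```

A memory found by only one of the two searches still gets a hybrid score in this sketch; the missing side simply contributes 0.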

Why Hybrid?

Vector search alone:

  • Great for semantic understanding
  • Finds related concepts
  • Can miss exact keywords
  • May return overly broad results

Metadata search alone:

  • Precise keyword matching
  • Fast structured filtering
  • Misses semantic relationships
  • Requires exact terms

Hybrid search:

  • Best of both worlds
  • Balances precision and recall
  • More robust to query variations
  • Better overall relevance

Vector Search

Vector search finds memories based on semantic meaning, not just keyword matching.

How It Works

1. Query Embedding

Your query is converted to a 1536-dimensional vector:

terminal
Query: "How to optimize database performance"
        ↓
Embedding: [0.23, -0.15, 0.78, ..., 0.42]

2. Similarity Calculation

The system compares your query vector against all memory chunk vectors:

terminal
Distance = Euclidean distance between vectors
Similarity = 1 / (1 + distance)

Lower distance = Higher similarity = Better match
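The two formulas above translate directly into code. A minimal sketch, with `euclideanDistance` and `similarity` as illustrative names rather than Kybernesis functions:

```javascript
// Euclidean distance between two equal-length numeric vectors.
function euclideanDistance(a, b) {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return Math.sqrt(sum);
}

// Similarity = 1 / (1 + distance), as stated above.
function similarity(a, b) {
  return 1 / (1 + euclideanDistance(a, b));
}
```

Identical vectors have distance 0 and therefore similarity 1; as the distance grows, the similarity approaches 0 without reaching it.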

3. Ranking

Chunks are ranked by similarity score:

terminal
Chunk A: similarity = 0.92 (excellent match)
Chunk B: similarity = 0.78 (good match)
Chunk C: similarity = 0.45 (weak match)

Semantic Matching Examples

Vector search understands related concepts:

terminal
Query: "increase revenue"
Matches:
  - "boost sales figures"
  - "improve profit margins"
  - "grow quarterly earnings"
  - "enhance monetization"

terminal
Query: "project planning"
Matches:
  - "roadmap development"
  - "milestone scheduling"
  - "sprint organization"
  - "delivery timelines"

What You Need to Know

When you search in Kybernesis:

  • Your query is converted to a numerical representation
  • This is compared against all your stored memories
  • The most semantically similar results are returned
  • By default, a search typically returns 10-15 results

Performance Characteristics

  • Speed: ~50-200ms for queries
  • Capacity: Millions of vectors
  • Accuracy: High for semantic queries
  • Cache: 30-second TTL for repeated queries

Metadata Filtering

Metadata search finds memories based on structured attributes.

Searchable Fields

Title Matching

Search memory titles for keywords:

terminal
Query: "budget"
Matches titles containing:
  - "Q4 Budget Analysis"
  - "2025 Budget Planning"
  - "Engineering Budget Review"

Tag Filtering

Filter by exact tag matches:

terminal
Tags: ["2025-planning", "high-priority"]
Returns only memories with BOTH tags

Source Type

Filter by memory origin:

terminal
Source: "upload"
Returns only uploaded files

Source: "connector"
Returns only synced content

Priority Range

Filter by importance score:

terminal
Priority >= 0.7
Returns high-priority memories only

Tier Filtering

Filter by storage tier:

terminal
Tier: "hot"
Returns only hot-tier memories

Date Ranges

Filter by creation or access time:

terminal
CreatedAt >= "2024-01-01"
CreatedAt <= "2024-12-31"
Returns memories from 2024

How Metadata Scores Work

Metadata scores consider:

  • Title matching - Keywords in memory titles (40% weight)
  • Tag overlap - How many tags match your filters (30% weight)
  • Priority - Higher priority memories score better (20% weight)
  • Recency - Recently accessed memories get a boost (10% weight)

Scores range from 0.0 (no match) to 1.0 (perfect match).
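As a rough sketch of how the four weighted factors could combine into one score: the weights below come from the list above, while the function name, the field names, and the assumption that each factor is already scaled to 0-1 are illustrative.

```javascript
// Weights from the list above; the factor names are hypothetical.
const WEIGHTS = { title: 0.4, tags: 0.3, priority: 0.2, recency: 0.1 };

// Each factor is assumed to be pre-computed on a 0-1 scale.
function metadataScore({ title, tags, priority, recency }) {
  return (
    WEIGHTS.title * title +
    WEIGHTS.tags * tags +
    WEIGHTS.priority * priority +
    WEIGHTS.recency * recency
  );
}
```

Because the weights sum to 1, a memory that scores 1.0 on every factor gets a metadata score of 1.0, and one that matches nothing gets 0.0.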

Boolean Logic

Metadata filters combine with AND logic:

terminal
Tags: ["ai", "python"]
+ Source: "upload"
+ Priority >= 0.5

→ Memories matching ALL conditions
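The AND behavior can be pictured as a single predicate that every condition must pass. A minimal sketch, assuming a memory object with `tags`, `source`, and `priority` fields; the function and the `minPriority` filter name are hypothetical:

```javascript
// Hypothetical memory shape: { tags: string[], source: string, priority: number }.
// A filter that is left undefined is treated as "no restriction".
function matchesAll(memory, filters) {
  const { tags = [], source, minPriority } = filters;
  return (
    tags.every(t => memory.tags.includes(t)) &&
    (source === undefined || memory.source === source) &&
    (minPriority === undefined || memory.priority >= minPriority)
  );
}
```

Failing any one condition excludes the memory, which is why adding filters can only shrink the result set.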

Query Syntax

Basic Queries

Simple text search:

terminal
how to deploy kubernetes

Returns memories semantically related to Kubernetes deployment.

Multi-word phrases:

terminal
"machine learning deployment"

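The phrase is embedded as a whole and matched in vector space, so results are semantically close to the phrase rather than guaranteed to contain it word for word.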

Questions:

terminal
What are the benefits of microservices?

Natural language questions work well with semantic search.

Query Best Practices

✅ Do

Use natural language:

terminal
How can I improve API response times?

Include context:

terminal
React performance optimization techniques for large lists

Be specific:

terminal
PostgreSQL query optimization for JOIN operations

Use domain terms:

terminal
OAuth2 authorization code flow implementation

❌ Avoid

Single keywords:

terminal
kubernetes

Too broad; use phrases instead.

Boolean operators:

terminal
docker AND kubernetes OR containers

Not supported; use natural-language phrases instead.

Wildcards:

terminal
deploy*

Not needed; semantic search handles word variations.

Query Length Guidelines

Length        Effectiveness   Use Case
1-2 words     Poor            Too vague
3-5 words     Good            Specific topics
6-15 words    Excellent       Detailed questions
16+ words     Moderate        May be overly specific

Optimal:

terminal
best practices for RESTful API design

Filtering by Tags

Tags provide powerful filtering capabilities when searching.

How Tag Filters Work

All tags required (AND logic):

terminal
tags: ["ai", "python", "tensorflow"]

Matches:
  Memory A: tags = ["ai", "python", "tensorflow", "keras"]  ✓
  Memory B: tags = ["ai", "python"]                         ✗
  Memory C: tags = ["ai", "machine-learning"]               ✗
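To see why Memory B and Memory C are rejected above, it can help to list which required tags each memory lacks. A small sketch; `missingTags` is a hypothetical helper, not part of Kybernesis:

```javascript
// Returns the required tags that a memory does not have.
// An empty array means the memory passes the tag filter.
function missingTags(memoryTags, requiredTags) {
  const have = new Set(memoryTags);
  return requiredTags.filter(tag => !have.has(tag));
}

const required = ["ai", "python", "tensorflow"];
const memoryA = ["ai", "python", "tensorflow", "keras"];
const memoryB = ["ai", "python"];
const memoryC = ["ai", "machine-learning"];

console.log(missingTags(memoryA, required)); // nothing missing, so A matches
console.log(missingTags(memoryB, required)); // lacks "tensorflow"
console.log(missingTags(memoryC, required)); // lacks "python" and "tensorflow"
```

Extra tags on a memory (such as "keras" on Memory A) do not affect the match; only missing ones do.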

Common Tag Patterns

By Project

terminal
tags: ["2025-planning", "product-roadmap"]

By Technology

terminal
tags: ["react", "typescript", "frontend"]

By Status

terminal
tags: ["high-priority", "in-progress"]

By Team

terminal
tags: ["engineering", "backend", "infrastructure"]

Combining Tags with Search

Tags narrow down semantic search results:

Example:

terminal
Query: "database optimization techniques"
Tags: ["postgresql", "production"]

Returns only memories that:
  ✓ Are semantically about database optimization
  AND
  ✓ Have BOTH "postgresql" AND "production" tags

This gives you the best of both worlds - semantic understanding with precise filtering.


Understanding Scores

Each search result includes multiple scores to help you understand relevance.

Score Types

Hybrid Score

Overall relevance combining vector + metadata:

terminal
hybridScore = (vectorScore * 0.7) + (metadataScore * 0.3)

Range: 0.0 to 1.0
Higher = More relevant

Interpretation:

  • 0.8 - 1.0: Excellent match (exact or highly relevant)
  • 0.6 - 0.8: Good match (relevant content)
  • 0.4 - 0.6: Moderate match (somewhat related)
  • 0.2 - 0.4: Weak match (tangentially related)
  • 0.0 - 0.2: Poor match (barely relevant)
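If you process results in code, the bands above can be turned into labels. A minimal sketch; the function name is hypothetical, and because the listed ranges share their boundary values, treating each lower bound as inclusive is an assumption:

```javascript
// Maps a hybrid score (0-1) to the interpretation bands listed above.
// Assumption: a boundary value such as 0.6 belongs to the higher band.
function describeHybridScore(score) {
  if (score >= 0.8) return "excellent";
  if (score >= 0.6) return "good";
  if (score >= 0.4) return "moderate";
  if (score >= 0.2) return "weak";
  return "poor";
}
```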

Vector Score

Semantic similarity only:

terminal
vectorScore = 1 / (1 + euclideanDistance)

Range: 0.0 to 1.0

Interpretation:

  • 0.9 - 1.0: Semantically near-identical (exactly 1.0 only at zero distance)
  • 0.7 - 0.9: Semantically very similar
  • 0.5 - 0.7: Semantically related
  • < 0.5: Semantically distant

Metadata Score

Structured attribute matching:

terminal
metadataScore = weighted combination of:
  - Title match
  - Tag overlap
  - Priority alignment
  - Recency

Range: 0.0 to 1.0

Interpretation:

  • 1.0: Perfect metadata match (all filters satisfied)
  • 0.5 - 1.0: Partial match (some filters satisfied)
  • < 0.5: Weak metadata match

Understanding Result Scores

When you see search results, each includes three scores:

Example result:

terminal
Title: "Kubernetes Deployment Guide"
Hybrid Score:   0.87  ← Overall relevance (excellent)
Vector Score:   0.92  ← Semantic similarity (very high)
Metadata Score: 0.75  ← Attribute match (good)
Tags: kubernetes, devops, production

How to read scores:

  • Hybrid score is what matters most - it's the overall relevance
  • Vector score shows how semantically similar the content is
  • Metadata score shows how well tags/attributes match

Focus on memories with hybrid scores above 0.6 for best results.
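The numbers in the example result are consistent with the hybrid formula, which a quick check confirms. The `keepRelevant` helper and the `hybridScore` field on each result are assumptions made for this sketch:

```javascript
// Recompute the hybrid score from the example result above.
const vectorScore = 0.92;
const metadataScore = 0.75;
const hybridScore = 0.7 * vectorScore + 0.3 * metadataScore; // 0.644 + 0.225
console.log(hybridScore.toFixed(2)); // prints 0.87

// Keep only results above a relevance threshold (0.6 is the suggestion above).
function keepRelevant(results, threshold = 0.6) {
  return results.filter(r => r.hybridScore > threshold);
}
```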


Performance Tips

Optimize Query Speed

1. Limit Results Appropriately

terminal
// Too many results = slower
{ limit: 50 }  // ❌ Slow

// Optimal for most use cases
{ limit: 10 }  // ✓ Fast

2. Use Tag Filters

terminal
// Broad search = slower
{ query: "deployment" }  // ❌ Searches everything

// Filtered search = faster
{ query: "deployment", tags: ["kubernetes"] }  // ✓ Focused

3. Cache Repeated Queries

The system automatically caches for 30 seconds:

terminal
Query 1: "deployment strategies" → 150ms
Query 2: "deployment strategies" → 5ms (cached)
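Kybernesis handles this caching for you. For intuition only, a time-to-live (TTL) cache with the same 30-second window might look like the sketch below; the class, its methods, and the injectable clock are all hypothetical and say nothing about how the real cache is built.

```javascript
// Minimal TTL cache sketch; all names here are hypothetical.
class TtlCache {
  constructor(ttlMs = 30000, now = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock, useful for testing
    this.entries = new Map(); // key -> { value, expiresAt }
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // entry has expired
      return undefined;
    }
    return entry.value;
  }

  set(key, value) {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

A repeated query inside the window is answered from the map; once the window has passed, the next lookup misses and the search runs again.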

4. Avoid Overly Generic Queries

terminal
// Too generic = many results to rank
{ query: "data" }  // ❌ Slow

// Specific = fewer results to rank
{ query: "database migration strategies" }  // ✓ Fast

Improve Result Quality

1. Be Specific

terminal
❌ "kubernetes"
✓ "kubernetes deployment best practices for production"

2. Use Multi-Word Queries

terminal
❌ "api"
✓ "RESTful API design patterns"

3. Include Context

terminal
❌ "optimize performance"
✓ "optimize React component rendering performance"

4. Combine with Tags

terminal
{
  query: "deployment strategies",
  tags: ["kubernetes", "production"]  // Narrows results
}

Performance Benchmarks

Operation                  Latency (p50)   Latency (p95)
Vector search              50ms            150ms
Metadata search            20ms            80ms
Hybrid search (uncached)   100ms           250ms
Hybrid search (cached)     5ms             10ms

For Developers

This guide focuses on using search through the Kybernesis interface.

If you need programmatic access to search (API/SDK), see the API Reference for:

  • REST API endpoints
  • Request/response schemas
  • TypeScript/JavaScript SDK examples
  • Python client examples
  • Authentication and rate limits

Search Tips

Refining Your Search

Start broad, then narrow:

terminal
Query 1: "deployment"
→ Too many results

Query 2: "kubernetes deployment"
→ Better, still broad

Query 3: "kubernetes deployment strategies for microservices"
→ Focused, high-quality results

Looking for Multiple Topics

If you need information on multiple related topics:

  1. Search for each topic separately
  2. Note common tags in the results
  3. Refine your next search using those tags

Example:

terminal
First search: "API design"
Common tags in results: ["rest", "graphql", "authentication"]

Refined search: "API design authentication patterns"
Add tags: ["rest", "graphql"]
→ More focused results
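Step 2 (noting common tags) can be automated if you work with results in code. A minimal sketch, assuming each result exposes a `tags` array; the function name and the `minCount` parameter are hypothetical:

```javascript
// Returns tags that appear in at least `minCount` results, most frequent first.
// Assumption: each result has a `tags` array of strings.
function commonTags(results, minCount = 2) {
  const counts = new Map();
  for (const r of results) {
    // A Set avoids counting the same tag twice within one result.
    for (const tag of new Set(r.tags ?? [])) {
      counts.set(tag, (counts.get(tag) ?? 0) + 1);
    }
  }
  return [...counts.entries()]
    .filter(([, n]) => n >= minCount)
    .sort((a, b) => b[1] - a[1])
    .map(([tag]) => tag);
}
```

The returned tags are candidates for the tag filter in your refined search.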

Troubleshooting

Issue: No Results Returned

Possible causes:

  1. Query too specific
  2. No memories match filters
  3. Embeddings not generated

Solutions:

Broaden your query:

terminal
❌ "kubernetes helm chart v3 deployment to AWS EKS"
✓ "kubernetes deployment strategies"

Remove tag filters:

terminal
// Try without tags first
{ query: "deployment" }

// Then add tags back incrementally
{ query: "deployment", tags: ["kubernetes"] }

Check if memories exist:

terminal
curl "https://api.kybernesis.com/api/memories?limit=10"

Issue: Irrelevant Results

Possible causes:

  1. Query too vague
  2. Low-quality memory content
  3. Missing embeddings

Solutions:

Add more context to query:

terminal
❌ "performance"
✓ "database query performance optimization"

Use tag filters:

terminal
{
  query: "optimization",
  tags: ["database", "postgresql"]  // Focus results
}

Check result scores:

terminal
results.forEach(r => {
  if (r.hybridScore < 0.5) {
    console.warn("Low relevance:", r.memory.title);
  }
});

Issue: Slow Queries

Possible causes:

  1. Too many results requested
  2. First query (not cached)
  3. System load

Solutions:

Reduce limit:

terminal
{ query: "...", limit: 10 }  // Instead of 50

Use tag filters:

terminal
{ query: "...", tags: ["specific-tag"] }  // Narrows search space

Monitor latency:

terminal
const start = Date.now();
const results = await search({ query: "..." });
console.log(`Latency: ${Date.now() - start}ms`);

Issue: Duplicate Results

Possible causes:

  1. Same memory with multiple matching chunks
  2. Similar memories with slight variations

Solutions:

Deduplicate by memoryId:

terminal
// Keep the first (highest-ranked) result per memoryId
const seen = new Set();
const unique = results.filter(r =>
  !seen.has(r.memoryId) && seen.add(r.memoryId)
);

Deduplicate by similarity:

terminal
// calculateSimilarity is a placeholder for your own comparison function (returns 0-1)
const deduped = results.filter((result, index) => {
  return !results.slice(0, index).some(prev =>
    calculateSimilarity(result, prev) > 0.95
  );
});
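This guide does not define `calculateSimilarity`, so you need to supply it. One simple possibility, shown purely as an assumption, is the Jaccard overlap of title words, using the `memory.title` field that appears in the score-checking example earlier:

```javascript
// One possible calculateSimilarity: Jaccard overlap of lowercase title words.
// Returns a value from 0 (no shared words) to 1 (same set of words).
function calculateSimilarity(a, b) {
  const words = r =>
    new Set(r.memory.title.toLowerCase().split(/\W+/).filter(Boolean));
  const wa = words(a);
  const wb = words(b);
  if (wa.size === 0 && wb.size === 0) return 1; // two empty titles count as equal
  let shared = 0;
  for (const w of wa) {
    if (wb.has(w)) shared++;
  }
  return shared / (wa.size + wb.size - shared);
}
```

With a 0.95 threshold, this version only treats results as duplicates when their titles use essentially the same words. Comparing content or embeddings instead would catch looser variations.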

Issue: Missing Expected Results

Possible causes:

  1. Memory in archive tier
  2. Memory not yet indexed
  3. Embedding version mismatch

Solutions:

Check memory status:

terminal
curl https://api.kybernesis.com/api/memories/{memoryId}

Verify tier:

terminal
// Check if memory is archived
if (memory.tier === "archive") {
  // May not appear in search
}

Re-index memory:

terminal
curl -X POST https://api.kybernesis.com/api/memories/{memoryId}/reindex

Next Steps

  • UI Guide - Learn to search using the topology interface
  • Core Concepts - Understand how retrieval fits into the system
  • Memory System - Dive deeper into embeddings and chunking