Retrieval Guide
Learn how to search and retrieve memories using Kybernesis's hybrid search system.
Table of Contents
- How Hybrid Search Works
- Vector Search
- Metadata Filtering
- Query Syntax
- Filtering by Tags
- Understanding Scores
- Performance Tips
- For Developers
- Search Tips
- Troubleshooting
How Hybrid Search Works
Kybernesis uses hybrid retrieval that combines two search methods for optimal results:
- Vector Search - Semantic similarity using AI embeddings
- Metadata Filtering - Structured search on attributes
The Hybrid Process
User Query: "machine learning deployment strategies"
│
├─→ Vector Search (70% weight)
│ ├─ Convert query to embedding
│ ├─ Compare to all memory chunks
│ └─ Find semantic matches
│
├─→ Metadata Search (30% weight)
│ ├─ Search titles
│ ├─ Filter by tags
│ └─ Match attributes
│
└─→ Combine & Rank
├─ Normalize scores (0-1)
├─ Apply weights (70/30)
├─ Merge duplicates
└─ Return top N results
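To make the combine-and-rank step concrete, here is a minimal TypeScript sketch of the idea, assuming each candidate chunk arrives with a memoryId and already-normalized vector and metadata scores (the type and function names are illustrative, not Kybernesis's internal API):
interface Candidate {
  memoryId: string;
  vectorScore: number;   // normalized to 0-1
  metadataScore: number; // normalized to 0-1
}
function combineAndRank(candidates: Candidate[], limit = 10) {
  const merged = new Map<string, Candidate & { hybridScore: number }>();
  for (const c of candidates) {
    // Apply the 70/30 weighting described above
    const hybridScore = c.vectorScore * 0.7 + c.metadataScore * 0.3;
    const existing = merged.get(c.memoryId);
    // Merge duplicates: keep the best-scoring chunk per memory
    if (!existing || hybridScore > existing.hybridScore) {
      merged.set(c.memoryId, { ...c, hybridScore });
    }
  }
  // Return the top N results by hybrid score
  return [...merged.values()]
    .sort((a, b) => b.hybridScore - a.hybridScore)
    .slice(0, limit);
}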
Why Hybrid?
Vector search alone:
- Great for semantic understanding
- Finds related concepts
- Can miss exact keywords
- May return overly broad results
Metadata search alone:
- Precise keyword matching
- Fast structured filtering
- Misses semantic relationships
- Requires exact terms
Hybrid search:
- Best of both worlds
- Balances precision and recall
- More robust to query variations
- Better overall relevance
Vector Search
Vector search finds memories based on semantic meaning, not just keyword matching.
How It Works
1. Query Embedding
Your query is converted to a 1536-dimensional vector:
Query: "How to optimize database performance"
↓
Embedding: [0.23, -0.15, 0.78, ..., 0.42]
2. Similarity Calculation
The system compares your query vector against all memory chunk vectors:
Distance = Euclidean distance between vectors
Similarity = 1 / (1 + distance)
Lower distance = Higher similarity = Better match
3. Ranking
Chunks are ranked by similarity score:
Chunk A: similarity = 0.92 (excellent match)
Chunk B: similarity = 0.78 (good match)
Chunk C: similarity = 0.45 (weak match)
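As a rough sketch of steps 2 and 3, assuming embeddings are plain arrays of numbers (the function names are illustrative):
function euclideanDistance(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return Math.sqrt(sum);
}
function similarity(query: number[], chunk: number[]): number {
  // Lower distance => higher similarity, mapped into (0, 1]
  return 1 / (1 + euclideanDistance(query, chunk));
}
// Rank chunks by similarity to the query embedding
function rankChunks(query: number[], chunks: { id: string; embedding: number[] }[]) {
  return chunks
    .map(c => ({ id: c.id, score: similarity(query, c.embedding) }))
    .sort((a, b) => b.score - a.score);
}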
Semantic Matching Examples
Vector search understands related concepts:
Query: "increase revenue"
Matches:
- "boost sales figures"
- "improve profit margins"
- "grow quarterly earnings"
- "enhance monetization"
Query: "project planning"
Matches:
- "roadmap development"
- "milestone scheduling"
- "sprint organization"
- "delivery timelines"
What You Need to Know
When you search in Kybernesis:
- Your query is converted to a numerical representation
- This is compared against all your stored memories
- The most semantically similar results are returned
- Returns roughly 10-15 results by default
Performance Characteristics
- Speed: ~50-200ms for queries
- Capacity: Millions of vectors
- Accuracy: High for semantic queries
- Cache: 30-second TTL for repeated queries
Metadata Filtering
Metadata search finds memories based on structured attributes.
Searchable Fields
Title Matching
Search memory titles for keywords:
Query: "budget"
Matches titles containing:
- "Q4 Budget Analysis"
- "2025 Budget Planning"
- "Engineering Budget Review"
Tag Filtering
Filter by exact tag matches:
Tags: ["2025-planning", "high-priority"]
Returns only memories with BOTH tags
Source Type
Filter by memory origin:
Source: "upload"
Returns only uploaded files
Source: "connector"
Returns only synced content
Priority Range
Filter by importance score:
Priority >= 0.7
Returns high-priority memories only
Tier Filtering
Filter by storage tier:
Tier: "hot"
Returns only hot-tier memories
Date Ranges
Filter by creation or access time:
CreatedAt >= "2024-01-01"
CreatedAt <= "2024-12-31"
Returns memories from 2024
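Putting the fields above together, a fully filtered request might look roughly like the sketch below. The field names are illustrative; check the API Reference for the exact schema:
const filters = {
  title: "budget",                          // keyword match against memory titles
  tags: ["2025-planning", "high-priority"], // ALL listed tags must be present
  source: "upload",                         // "upload" or "connector"
  minPriority: 0.7,                         // priority >= 0.7
  tier: "hot",                              // storage tier
  createdAfter: "2024-01-01",               // date range start
  createdBefore: "2024-12-31",              // date range end
};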
How Metadata Scores Work
Metadata scores consider:
- Title matching - Keywords in memory titles (40% weight)
- Tag overlap - How many tags match your filters (30% weight)
- Priority - Higher priority memories score better (20% weight)
- Recency - Recently accessed memories get a boost (10% weight)
Scores range from 0.0 (no match) to 1.0 (perfect match).
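A minimal sketch of that weighting, assuming each component has already been normalized to the 0-1 range (how Kybernesis computes the individual components is internal):
function metadataScore(components: {
  titleMatch: number; // 0-1: keyword match against the title
  tagOverlap: number; // 0-1: fraction of requested tags present
  priority: number;   // 0-1: the memory's priority score
  recency: number;    // 0-1: boost for recently accessed memories
}): number {
  return (
    components.titleMatch * 0.4 +
    components.tagOverlap * 0.3 +
    components.priority * 0.2 +
    components.recency * 0.1
  );
}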
Boolean Logic
Metadata filters combine with AND logic:
Tags: ["ai", "python"]
+ Source: "upload"
+ Priority >= 0.5
→ Memories matching ALL conditions
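Conceptually, the combined filter is a single predicate in which every condition must hold. A small illustrative sketch (the shape of the memory object is assumed, not the actual type):
interface MemoryMeta {
  tags: string[];
  source: string;
  priority: number;
}
// Each filter contributes one condition; a memory must satisfy them all
function matchesAllFilters(memory: MemoryMeta): boolean {
  return (
    ["ai", "python"].every(tag => memory.tags.includes(tag)) && // all tags present
    memory.source === "upload" &&                               // AND source matches
    memory.priority >= 0.5                                      // AND priority threshold
  );
}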
Query Syntax
Basic Queries
Simple text search:
how to deploy kubernetes
Returns memories semantically related to Kubernetes deployment.
Multi-word phrases:
"machine learning deployment"
The whole phrase is embedded together, so results reflect its combined meaning rather than an exact keyword match.
Questions:
What are the benefits of microservices?
Natural language questions work well with semantic search.
Query Best Practices
✅ Do
Use natural language:
How can I improve API response times?
Include context:
React performance optimization techniques for large lists
Be specific:
PostgreSQL query optimization for JOIN operations
Use domain terms:
OAuth2 authorization code flow implementation
❌ Avoid
Single keywords:
kubernetes
Too broad; use phrases instead.
Boolean operators:
docker AND kubernetes OR containers
Not supported; use natural phrases instead.
Wildcards:
deploy*
Not needed; semantic search handles variations.
Query Length Guidelines
| Length | Effectiveness | Use Case |
|---|---|---|
| 1-2 words | Poor | Too vague |
| 3-5 words | Good | Specific topics |
| 6-15 words | Excellent | Detailed questions |
| 16+ words | Moderate | May be overly specific |
Optimal:
best practices for RESTful API design
Filtering by Tags
Tags provide powerful filtering capabilities when searching.
How Tag Filters Work
All tags required (AND logic):
tags: ["ai", "python", "tensorflow"]
Matches:
Memory A: tags = ["ai", "python", "tensorflow", "keras"] ✓
Memory B: tags = ["ai", "python"] ✗
Memory C: tags = ["ai", "machine-learning"] ✗
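In code terms, the check mirrors the example above (a sketch, not the actual implementation):
const required = ["ai", "python", "tensorflow"];
// AND logic: every required tag must be present on the memory
const hasAllTags = (memoryTags: string[]) =>
  required.every(tag => memoryTags.includes(tag));
hasAllTags(["ai", "python", "tensorflow", "keras"]); // true  (Memory A)
hasAllTags(["ai", "python"]);                        // false (Memory B)
hasAllTags(["ai", "machine-learning"]);              // false (Memory C)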
Common Tag Patterns
By Project
tags: ["2025-planning", "product-roadmap"]
By Technology
tags: ["react", "typescript", "frontend"]
By Status
tags: ["high-priority", "in-progress"]
By Team
tags: ["engineering", "backend", "infrastructure"]
Combining Tags with Search
Tags narrow down semantic search results:
Example:
Query: "database optimization techniques"
Tags: ["postgresql", "production"]
Returns only memories that:
✓ Are semantically about database optimization
AND
✓ Have BOTH "postgresql" AND "production" tags
This pairs semantic understanding with precise filtering.
Understanding Scores
Each search result includes multiple scores to help you understand relevance.
Score Types
Hybrid Score
Overall relevance combining vector + metadata:
hybridScore = (vectorScore * 0.7) + (metadataScore * 0.3)
Range: 0.0 to 1.0
Higher = More relevant
Interpretation:
- 0.8 - 1.0: Excellent match (exact or highly relevant)
- 0.6 - 0.8: Good match (relevant content)
- 0.4 - 0.6: Moderate match (somewhat related)
- 0.2 - 0.4: Weak match (tangentially related)
- 0.0 - 0.2: Poor match (barely relevant)
Vector Score
Semantic similarity only:
vectorScore = 1 / (1 + euclideanDistance)
Range: 0.0 to 1.0
Interpretation:
- 0.9 - 1.0: Semantically identical
- 0.7 - 0.9: Semantically very similar
- 0.5 - 0.7: Semantically related
- < 0.5: Semantically distant
Metadata Score
Structured attribute matching:
metadataScore = weighted combination of:
- Title match
- Tag overlap
- Priority alignment
- Recency
Range: 0.0 to 1.0
Interpretation:
- 1.0: Perfect metadata match (all filters satisfied)
- 0.5 - 1.0: Partial match (some filters satisfied)
- < 0.5: Weak metadata match
Understanding Result Scores
When you see search results, each includes three scores:
Example result:
Title: "Kubernetes Deployment Guide"
Hybrid Score: 0.87 ← Overall relevance (excellent)
Vector Score: 0.92 ← Semantic similarity (very high)
Metadata Score: 0.75 ← Attribute match (good)
Tags: kubernetes, devops, production
How to read scores:
- Hybrid score is what matters most - it's the overall relevance
- Vector score shows how semantically similar the content is
- Metadata score shows how well tags/attributes match
Focus on memories with hybrid scores above 0.6 for best results.
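If you are working with results programmatically, a quick way to apply that threshold (assuming results is the array returned by a search call, with the hybridScore field shown above):
// Keep only results with strong overall relevance
const strongMatches = results.filter(r => r.hybridScore >= 0.6);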
Performance Tips
Optimize Query Speed
1. Limit Results Appropriately
// Too many results = slower
{ limit: 50 } // ❌ Slow
// Optimal for most use cases
{ limit: 10 } // ✓ Fast
2. Use Tag Filters
// Broad search = slower
{ query: "deployment" } // ❌ Searches everything
// Filtered search = faster
{ query: "deployment", tags: ["kubernetes"] } // ✓ Focused
3. Cache Repeated Queries
The system automatically caches for 30 seconds:
Query 1: "deployment strategies" → 150ms
Query 2: "deployment strategies" → 5ms (cached)
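The behavior can be pictured as a simple in-memory map keyed by the query text. The sketch below only illustrates the concept, not Kybernesis's actual cache; search stands in for whatever search call you use, as in the latency example later in this guide:
const CACHE_TTL_MS = 30_000; // 30-second TTL
const queryCache = new Map<string, { results: unknown[]; expires: number }>();
async function cachedSearch(query: string): Promise<unknown[]> {
  const hit = queryCache.get(query);
  if (hit && hit.expires > Date.now()) {
    return hit.results; // repeated query within 30s: served from cache
  }
  const results = await search({ query }); // full hybrid search
  queryCache.set(query, { results, expires: Date.now() + CACHE_TTL_MS });
  return results;
}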
4. Avoid Overly Generic Queries
// Too generic = many results to rank
{ query: "data" } // ❌ Slow
// Specific = fewer results to rank
{ query: "database migration strategies" } // ✓ Fast
Improve Result Quality
1. Be Specific
❌ "kubernetes"
✓ "kubernetes deployment best practices for production"
2. Use Multi-Word Queries
❌ "api"
✓ "RESTful API design patterns"
3. Include Context
❌ "optimize performance"
✓ "optimize React component rendering performance"
4. Combine with Tags
{
query: "deployment strategies",
tags: ["kubernetes", "production"] // Narrows results
}
Performance Benchmarks
| Operation | Latency (p50) | Latency (p95) |
|---|---|---|
| Vector search | 50ms | 150ms |
| Metadata search | 20ms | 80ms |
| Hybrid search (uncached) | 100ms | 250ms |
| Hybrid search (cached) | 5ms | 10ms |
For Developers
This guide focuses on using search through the Kybernesis interface.
If you need programmatic access to search (API/SDK), see the API Reference for:
- REST API endpoints
- Request/response schemas
- TypeScript/JavaScript SDK examples
- Python client examples
- Authentication and rate limits
Search Tips
Refining Your Search
Start broad, then narrow:
Query 1: "deployment"
→ Too many results
Query 2: "kubernetes deployment"
→ Better, still broad
Query 3: "kubernetes deployment strategies for microservices"
→ Focused, high-quality results
Looking for Multiple Topics
If you need information on multiple related topics:
- Search for each topic separately
- Note common tags in the results
- Refine your next search using those tags
Example:
First search: "API design"
Common tags in results: ["rest", "graphql", "authentication"]
Refined search: "API design authentication patterns"
Add tags: ["rest", "graphql"]
→ More focused results
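A small sketch of that workflow, counting which tags appear most often in the first result set (this assumes each result exposes its memory's tags, as in the score example above):
// Count how often each tag appears in the first search's results
const tagCounts = new Map<string, number>();
for (const r of results) {
  for (const tag of r.memory.tags) {
    tagCounts.set(tag, (tagCounts.get(tag) ?? 0) + 1);
  }
}
// The most common tags are good candidates for narrowing the next search
const commonTags = [...tagCounts.entries()]
  .sort((a, b) => b[1] - a[1])
  .slice(0, 3)
  .map(([tag]) => tag);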
Troubleshooting
Issue: No Results Returned
Possible causes:
- Query too specific
- No memories match filters
- Embeddings not generated
Solutions:
✓ Broaden your query:
❌ "kubernetes helm chart v3 deployment to AWS EKS"
✓ "kubernetes deployment strategies"
✓ Remove tag filters:
// Try without tags first
{ query: "deployment" }
// Then add tags back incrementally
{ query: "deployment", tags: ["kubernetes"] }
✓ Check if memories exist:
curl https://api.kybernesis.com/api/memories?limit=10
Issue: Irrelevant Results
Possible causes:
- Query too vague
- Low-quality memory content
- Missing embeddings
Solutions:
✓ Add more context to query:
❌ "performance"
✓ "database query performance optimization"
✓ Use tag filters:
{
query: "optimization",
tags: ["database", "postgresql"] // Focus results
}
✓ Check result scores:
results.forEach(r => {
if (r.hybridScore < 0.5) {
console.warn("Low relevance:", r.memory.title);
}
});
Issue: Slow Queries
Possible causes:
- Too many results requested
- First query (not cached)
- System load
Solutions:
✓ Reduce limit:
{ query: "...", limit: 10 } // Instead of 50
✓ Use tag filters:
{ query: "...", tags: ["specific-tag"] } // Narrows search space
✓ Monitor latency:
const start = Date.now();
const results = await search({ query: "..." });
console.log(`Latency: ${Date.now() - start}ms`);
Issue: Duplicate Results
Possible causes:
- Same memory with multiple matching chunks
- Similar memories with slight variations
Solutions:
✓ Deduplicate by memoryId:
const unique = Array.from(
new Map(results.map(r => [r.memoryId, r])).values()
);
✓ Deduplicate by similarity:
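// Note: calculateSimilarity is an application-supplied helper (for example,
// comparing titles or embeddings); it is not part of the search results.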
const deduped = results.filter((result, index) => {
return !results.slice(0, index).some(prev =>
calculateSimilarity(result, prev) > 0.95
);
});
Issue: Missing Expected Results
Possible causes:
- Memory in archive tier
- Memory not yet indexed
- Embedding version mismatch
Solutions:
✓ Check memory status:
curl https://api.kybernesis.com/api/memories/{memoryId}
✓ Verify tier:
// Check if memory is archived
if (memory.tier === "archive") {
// May not appear in search
}
✓ Re-index memory:
curl -X POST https://api.kybernesis.com/api/memories/{memoryId}/reindex
Next Steps
- UI Guide - Learn to search using the topology interface
- Core Concepts - Understand how retrieval fits into the system
- Memory System - Dive deeper into embeddings and chunking