Retrieval Guide
Learn how to search and retrieve memories using Kybernesis's hybrid search system.
Table of Contents
- How Hybrid Search Works
- Vector Search
- Metadata Filtering
- Query Syntax
- Filtering by Tags
- Understanding Scores
- Performance Tips
- API Examples
- Advanced Techniques
- Troubleshooting
How Hybrid Search Works
Kybernesis uses hybrid retrieval that combines three search methods plus cognitive enrichment:
- Vector Search (55% weight) - Semantic similarity using AI embeddings
- Metadata Filtering (25% weight) - Structured search on attributes
- Keyword Overlap (20% weight) - Lexical term matching
The Hybrid Process
User Query: "What does Ian work on?"
│
├─→ Vector Search (55%)
│ ├─ Convert query to embedding
│ ├─ Compare to all memory chunks
│ └─ Find semantic matches
│
├─→ Metadata Search (25%)
│ ├─ Search titles
│ ├─ Filter by tags
│ └─ Match attributes
│
├─→ Keyword Search (20%)
│ └─ Literal term overlap in content
│
└─→ Combine, Boost & Enrich
  ├─ Normalize and weight scores
  ├─ Apply fact-aware boosts (confidence-weighted)
  ├─ Apply graph neighbor boost (+0.08)
  ├─ Apply surprisal boost (+0.03 for novel facts)
  ├─ Fetch entity profiles (with narratives)
  ├─ Fetch reasoning insights (deductions/inductions)
  ├─ Fetch pending contradictions
  └─ Return enriched results
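The combine-and-boost step can be sketched as a weighted sum followed by additive boosts. This is a minimal illustration, not the service's implementation: it assumes each per-method score is already normalized to 0.0-1.0 and that the total is clamped at 1.0. The function and parameter names are hypothetical, and fact-aware boosts are left out because their formula is not given here.

```python
# Weights and boost sizes are taken from the overview above.
VECTOR_WEIGHT = 0.55
METADATA_WEIGHT = 0.25
KEYWORD_WEIGHT = 0.20
GRAPH_NEIGHBOR_BOOST = 0.08
SURPRISAL_BOOST = 0.03


def combine_scores(vector, metadata, keyword, graph_neighbor=False, novel_fact=False):
    """Blend normalized per-method scores, then add boosts (hypothetical helper)."""
    score = (
        VECTOR_WEIGHT * vector
        + METADATA_WEIGHT * metadata
        + KEYWORD_WEIGHT * keyword
    )
    if graph_neighbor:
        score += GRAPH_NEIGHBOR_BOOST
    if novel_fact:
        score += SURPRISAL_BOOST
    # Clamping at 1.0 is an assumption, not documented behavior.
    return min(score, 1.0)


base = combine_scores(0.92, 0.75, 0.50)
boosted = combine_scores(0.92, 0.75, 0.50, graph_neighbor=True)
print(round(base, 4), round(boosted, 4))
```

Because vector similarity carries more than half the weight, a strong semantic match can outrank a memory that only matches on metadata or keywords.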
Cognitive Context in Results
Beyond ranked memories, each retrieval response includes:
- Entity Profiles: Structured facts and narrative summaries for detected entities
- Insights: Reasoning-derived deductions and inductions (confidence >= 0.7)
- Pending Contradictions: Conflicting facts that need human resolution
- Reasoning Traces: Per-result explanations of why each memory ranked where it did
Why Hybrid?
Vector search alone:
- Great for semantic understanding
- Finds related concepts
- Can miss exact keywords
- May return overly broad results
Metadata search alone:
- Precise keyword matching
- Fast structured filtering
- Misses semantic relationships
- Requires exact terms
Hybrid search:
- Combines the strengths of each method
- Balances precision and recall
- More robust to query variations
- Better overall relevance
Vector Search
Vector search finds memories based on semantic meaning, not just keyword matching.
How It Works
1. Query Embedding
Your query is converted to a 1536-dimensional vector:
Query: "How to optimize database performance"
↓
Embedding: [0.23, -0.15, 0.78, ..., 0.42]
2. Similarity Calculation
The system compares your query vector against all memory chunk vectors:
Distance = Euclidean distance between vectors
Similarity = 1 / (1 + distance)
Lower distance = Higher similarity = Better match
3. Ranking
Chunks are ranked by similarity score:
Chunk A: similarity = 0.92 (excellent match)
Chunk B: similarity = 0.78 (good match)
Chunk C: similarity = 0.45 (weak match)
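The distance-to-similarity conversion and the ranking step can be tried on toy data. The sketch below uses 3-dimensional vectors in place of real 1536-dimensional embeddings; the function names and chunk IDs are illustrative only.

```python
import math


def similarity(query_vec, chunk_vec):
    """Convert Euclidean distance into a similarity in (0, 1]."""
    distance = math.dist(query_vec, chunk_vec)
    return 1.0 / (1.0 + distance)


def rank_chunks(query_vec, chunks):
    """Return (chunk_id, similarity) pairs, best match first."""
    scored = [(chunk_id, similarity(query_vec, vec)) for chunk_id, vec in chunks.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional vectors; real embeddings have 1536 dimensions.
query = [0.2, 0.1, 0.7]
chunks = {
    "chunk_a": [0.2, 0.1, 0.7],  # identical to the query, distance 0.0
    "chunk_b": [0.5, 0.5, 0.7],  # distance 0.5
    "chunk_c": [1.2, 0.1, 0.7],  # distance 1.0
}
for chunk_id, score in rank_chunks(query, chunks):
    print(f"{chunk_id}: {score:.2f}")
```

A distance of 0 maps to a similarity of 1.0, and the similarity approaches 0 as the distance grows, so the score never goes negative.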
Semantic Matching Examples
Vector search understands related concepts:
Query: "increase revenue"
Matches:
- "boost sales figures"
- "improve profit margins"
- "grow quarterly earnings"
- "enhance monetization"
Query: "project planning"
Matches:
- "roadmap development"
- "milestone scheduling"
- "sprint organization"
- "delivery timelines"
Vector Search Parameters
Configurable via API:
{
query: "your search query",
limit: 10, // Number of results
orgId: "org_123" // Organization filter
}
Behind the scenes:
- Collection: "kybernesis_memories"
- Distance metric: Euclidean
- Embedding model: text-embedding-3-small
- Dimensions: 1536
Performance Characteristics
- Speed: ~50-200ms for queries
- Capacity: Millions of vectors
- Accuracy: High for semantic queries
- Cache: 30-second TTL for repeated queries
Metadata Filtering
Metadata search finds memories based on structured attributes.
Searchable Fields
Title Matching
Search memory titles for keywords:
Query: "budget"
Matches titles containing:
- "Q4 Budget Analysis"
- "2025 Budget Planning"
- "Engineering Budget Review"
Tag Filtering
Filter by exact tag matches:
Tags: ["2025-planning", "high-priority"]
Returns only memories with BOTH tags
Source Type
Filter by memory origin:
Source: "upload"
Returns only uploaded files
Source: "connector"
Returns only synced content
Priority Range
Filter by importance score:
Priority >= 0.7
Returns high-priority memories only
Tier Filtering
Filter by storage tier:
Tier: "hot"
Returns only hot-tier memories
Date Ranges
Filter by creation or access time:
CreatedAt >= "2024-01-01"
CreatedAt <= "2024-12-31"
Returns memories from 2024
Metadata Score Calculation
// Weighted sum of the component scores (the weights add up to 1.0)
metadataScore =
  titleMatch * 0.4 +     // Title keyword presence
  tagMatch * 0.3 +       // Tag overlap
  priorityMatch * 0.2 +  // Priority alignment
  recencyBoost * 0.1;    // Recent creation/access
// Range: 0.0 to 1.0
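A worked example makes the weighting concrete. The component values below are invented for illustration, and the dictionary keys are hypothetical names for the four components in the formula.

```python
# Component weights from the formula above; they sum to 1.0.
WEIGHTS = {
    "title_match": 0.4,
    "tag_match": 0.3,
    "priority_match": 0.2,
    "recency_boost": 0.1,
}


def metadata_score(components):
    """Weighted sum of component scores, each expected in 0.0-1.0."""
    return sum(WEIGHTS[name] * components.get(name, 0.0) for name in WEIGHTS)


example = {
    "title_match": 1.0,     # query term appears in the title
    "tag_match": 0.5,       # one of two requested tags is present
    "priority_match": 0.8,
    "recency_boost": 0.2,
}
print(round(metadata_score(example), 2))
```

Here the score works out to 0.4 + 0.15 + 0.16 + 0.02 = 0.73. The title match alone contributes more than half of it.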
Boolean Logic
Metadata filters combine with AND logic:
Tags: ["ai", "python"]
+ Source: "upload"
+ Priority >= 0.5
→ Memories matching ALL conditions
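The AND semantics can be mirrored client-side, for example when post-filtering results that are already loaded. This is a sketch under assumed field names (`tags`, `source`, `priority`, as they appear in the response examples), not the server's filter implementation.

```python
def matches_filters(memory, tags=(), source=None, min_priority=None):
    """Return True only if the memory satisfies every supplied filter."""
    if not set(tags).issubset(memory["tags"]):
        return False  # at least one required tag is missing
    if source is not None and memory["source"] != source:
        return False
    if min_priority is not None and memory["priority"] < min_priority:
        return False
    return True


memories = [
    {"id": "mem_1", "tags": ["ai", "python"], "source": "upload", "priority": 0.8},
    {"id": "mem_2", "tags": ["ai", "python"], "source": "connector", "priority": 0.9},
    {"id": "mem_3", "tags": ["ai"], "source": "upload", "priority": 0.7},
    {"id": "mem_4", "tags": ["ai", "python"], "source": "upload", "priority": 0.3},
]
selected = [
    m["id"]
    for m in memories
    if matches_filters(m, tags=["ai", "python"], source="upload", min_priority=0.5)
]
print(selected)
```

Only `mem_1` passes: `mem_2` fails on source, `mem_3` on tags, and `mem_4` on priority.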
Query Syntax
Basic Queries
Simple text search:
how to deploy kubernetes
Returns memories semantically related to Kubernetes deployment.
Multi-word phrases:
"machine learning deployment"
Embeds the phrase as a whole and matches it semantically, so results may paraphrase the phrase rather than contain it verbatim.
Questions:
What are the benefits of microservices?
Natural language questions work well with semantic search.
Query Best Practices
✅ Do
Use natural language:
How can I improve API response times?
Include context:
React performance optimization techniques for large lists
Be specific:
PostgreSQL query optimization for JOIN operations
Use domain terms:
OAuth2 authorization code flow implementation
❌ Avoid
Single keywords:
kubernetes
Too broad; use phrases instead.
Boolean operators:
docker AND kubernetes OR containers
Not supported; use natural phrases.
Wildcards:
deploy*
Not needed; semantic search handles variations.
Query Length Guidelines
| Length | Effectiveness | Use Case |
|---|---|---|
| 1-2 words | Poor | Too vague |
| 3-5 words | Good | Specific topics |
| 6-15 words | Excellent | Detailed questions |
| 16+ words | Moderate | May be overly specific |
Optimal:
best practices for RESTful API design
Filtering by Tags
Tags provide powerful filtering capabilities.
Tag Filter Syntax
API request:
POST /retrieval/hybrid
{
"query": "deployment strategies",
"tags": ["kubernetes", "production"],
"limit": 10
}
cURL example:
curl -X POST https://api.kybernesis.com/retrieval/hybrid \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"query": "deployment strategies",
"tags": ["kubernetes", "production"],
"orgId": "org_123",
"limit": 10
}'
Tag Matching Behavior
All tags required (AND logic):
tags: ["ai", "python", "tensorflow"]
Matches:
Memory A: tags = ["ai", "python", "tensorflow", "keras"] ✓
Memory B: tags = ["ai", "python"] ✗
Memory C: tags = ["ai", "machine-learning"] ✗
Common Tag Patterns
By Project
tags: ["2025-planning", "product-roadmap"]
By Technology
tags: ["react", "typescript", "frontend"]
By Status
tags: ["high-priority", "in-progress"]
By Team
tags: ["engineering", "backend", "infrastructure"]
Combining Tags with Search
Narrow semantic search with tags:
{
query: "database optimization techniques",
tags: ["postgresql", "production"],
limit: 10
}
This returns:
- Memories semantically about database optimization
- AND tagged with both "postgresql" AND "production"
Tag Discovery
Finding available tags:
GET /api/tags?orgId=org_123
Returns list of all tags used in your organization.
Understanding Scores
Each search result includes multiple scores to help you understand relevance.
Score Types
Hybrid Score
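Overall relevance combining the vector, metadata, and keyword scores with the weights from How Hybrid Search Works, before boosts are applied:
hybridScore = (vectorScore * 0.55) + (metadataScore * 0.25) + (keywordScore * 0.20)
Range: 0.0 to 1.0
Higher = More relevant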
Interpretation:
- 0.8 - 1.0: Excellent match (exact or highly relevant)
- 0.6 - 0.8: Good match (relevant content)
- 0.4 - 0.6: Moderate match (somewhat related)
- 0.2 - 0.4: Weak match (tangentially related)
- 0.0 - 0.2: Poor match (barely relevant)
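The interpretation bands translate directly into a small helper for labeling results in logs or a UI. The function name is hypothetical, and sending boundary values such as 0.8 to the higher band is an assumption, since the bands overlap at their edges.

```python
def relevance_label(hybrid_score):
    """Map a hybrid score to the interpretation bands listed above."""
    if not 0.0 <= hybrid_score <= 1.0:
        raise ValueError("hybrid score must be between 0.0 and 1.0")
    # Boundary values go to the higher band; that choice is an assumption.
    if hybrid_score >= 0.8:
        return "excellent"
    if hybrid_score >= 0.6:
        return "good"
    if hybrid_score >= 0.4:
        return "moderate"
    if hybrid_score >= 0.2:
        return "weak"
    return "poor"


for score in (0.87, 0.6, 0.35):
    print(score, relevance_label(score))
```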
Vector Score
Semantic similarity only:
vectorScore = 1 / (1 + euclideanDistance)
Range: 0.0 to 1.0
Interpretation:
- 0.9 - 1.0: Semantically identical
- 0.7 - 0.9: Semantically very similar
- 0.5 - 0.7: Semantically related
- < 0.5: Semantically distant
Metadata Score
Structured attribute matching:
metadataScore = weighted combination of:
- Title match
- Tag overlap
- Priority alignment
- Recency
Range: 0.0 to 1.0
Interpretation:
- 1.0: Perfect metadata match (all filters satisfied)
- 0.5 - 1.0: Partial match (some filters satisfied)
- < 0.5: Weak metadata match
Score Normalization
Scores are normalized to ensure fair comparison:
// Vector scores normalized by maximum
maxVector = 0.95
score1 = 0.85 → normalized = 0.85 / 0.95 = 0.89
// Metadata scores normalized by maximum
maxMetadata = 0.80
score2 = 0.60 → normalized = 0.60 / 0.80 = 0.75
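Max-normalization as described above is a one-line division per score. The sketch below reproduces the two worked values; the function name is illustrative, and returning zeros when every score is zero is an assumption about how that edge case should be handled.

```python
def normalize_by_max(scores):
    """Divide each score by the largest one so the top result becomes 1.0."""
    top = max(scores, default=0.0)
    if top <= 0.0:
        # Nothing to scale against; return zeros rather than divide by zero.
        return [0.0 for _ in scores]
    return [score / top for score in scores]


vector_scores = [0.95, 0.85, 0.40]
metadata_scores = [0.80, 0.60]
print([round(s, 2) for s in normalize_by_max(vector_scores)])
print([round(s, 2) for s in normalize_by_max(metadata_scores)])
```

After normalization the best result in each list scores exactly 1.0, so a normalized value only says how a result compares with the top hit for that query, not how good the match is in absolute terms.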
Example Result
{
"memoryId": "mem_abc123",
"hybridScore": 0.87, // Overall: excellent match
"vectorScore": 0.92, // Semantic: very similar
"metadataScore": 0.75, // Metadata: good match
"memory": {
"title": "Kubernetes Deployment Guide",
"tags": ["kubernetes", "devops", "production"]
},
"chunks": [
{
"chunkId": "chunk_def456",
"similarity": 0.92,
"document": "To deploy to Kubernetes..."
}
]
}
Performance Tips
Optimize Query Speed
1. Limit Results Appropriately
// Too many results = slower
{ limit: 50 } // ❌ Slow
// Optimal for most use cases
{ limit: 10 } // ✓ Fast
2. Use Tag Filters
// Broad search = slower
{ query: "deployment" } // ❌ Searches everything
// Filtered search = faster
{ query: "deployment", tags: ["kubernetes"] } // ✓ Focused
3. Cache Repeated Queries
The system automatically caches for 30 seconds:
Query 1: "deployment strategies" → 150ms
Query 2: "deployment strategies" → 5ms (cached)
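To see what a 30-second TTL means in practice, here is a minimal time-based cache. It only illustrates the expiry semantics and is not the service's caching code; the class, the injected clock, and the fake search function are all hypothetical.

```python
import time


class TTLCache:
    """Tiny time-based cache; entries expire ttl_seconds after being stored."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries = {}

    def get_or_compute(self, key, compute):
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self.ttl_seconds:
            return entry[1]  # still fresh: skip the expensive call
        value = compute()
        self._entries[key] = (now, value)
        return value


calls = []


def fake_search():
    calls.append(1)  # stands in for a real retrieval request
    return ["mem_abc123"]


fake_now = [0.0]  # controllable clock so the example runs instantly
cache = TTLCache(ttl_seconds=30.0, clock=lambda: fake_now[0])
cache.get_or_compute("deployment strategies", fake_search)  # miss
fake_now[0] = 10.0
cache.get_or_compute("deployment strategies", fake_search)  # hit
fake_now[0] = 45.0
cache.get_or_compute("deployment strategies", fake_search)  # expired, recomputed
print(len(calls))
```

The search function runs twice: once for the initial miss and once after the entry has aged past 30 seconds.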
4. Avoid Overly Generic Queries
// Too generic = many results to rank
{ query: "data" } // ❌ Slow
// Specific = fewer results to rank
{ query: "database migration strategies" } // ✓ Fast
Improve Result Quality
1. Be Specific
❌ "kubernetes"
✓ "kubernetes deployment best practices for production"
2. Use Multi-Word Queries
❌ "api"
✓ "RESTful API design patterns"
3. Include Context
❌ "optimize performance"
✓ "optimize React component rendering performance"
4. Combine with Tags
{
query: "deployment strategies",
tags: ["kubernetes", "production"] // Narrows results
}
Batch Queries
For multiple searches, send requests in parallel:
// Sequential (slow)
const result1 = await search("query1");
const result2 = await search("query2");
const result3 = await search("query3");
// Parallel (fast)
const [result1, result2, result3] = await Promise.all([
search("query1"),
search("query2"),
search("query3")
]);
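The same parallel pattern is available in Python through a thread pool, which suits I/O-bound HTTP calls. In this sketch `search` is a placeholder that sleeps instead of calling the API, so the snippet runs without network access.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def search(query):
    """Placeholder for a real retrieval call; sleeps to simulate network latency."""
    time.sleep(0.1)
    return {"query": query, "results": []}


queries = ["query1", "query2", "query3"]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=len(queries)) as pool:
    # map() preserves input order, so responses line up with queries.
    responses = list(pool.map(search, queries))
elapsed = time.perf_counter() - start

print([r["query"] for r in responses], f"{elapsed:.2f}s")
```

With three workers the total time is close to one call's latency rather than the sum of all three.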
Performance Benchmarks
| Operation | Latency (p50) | Latency (p95) |
|---|---|---|
| Vector search | 50ms | 150ms |
| Metadata search | 20ms | 80ms |
| Hybrid search (uncached) | 100ms | 250ms |
| Hybrid search (cached) | 5ms | 10ms |
API Examples
Basic Search
HTTP Request:
curl -X POST https://api.kybernesis.com/retrieval/hybrid \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"orgId": "org_123",
"query": "how to optimize database queries",
"limit": 10
}'
Response:
{
"status": "ok",
"orgId": "org_123",
"query": "how to optimize database queries",
"limit": 10,
"results": [
{
"memoryId": "mem_abc123",
"hybridScore": 0.87,
"vectorScore": 0.92,
"metadataScore": 0.75,
"memory": {
"id": "mem_abc123",
"title": "Database Optimization Guide",
"summary": "Comprehensive guide to optimizing SQL queries...",
"tags": ["database", "postgresql", "optimization"],
"source": "upload",
"priority": 0.8,
"tier": "hot",
"createdAtIso": "2024-10-15T10:30:00Z"
},
"chunks": [
{
"chunkId": "chunk_def456",
"similarity": 0.92,
"document": "To optimize database queries, consider using indexes..."
}
]
}
],
"diagnostics": {
"vector": [
{
"chunkId": "chunk_def456",
"distance": 0.087,
"similarity": 0.92
}
],
"metadata": [
{
"memoryId": "mem_abc123",
"score": 0.75
}
]
}
}
Search with Tag Filters
curl -X POST https://api.kybernesis.com/retrieval/hybrid \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"orgId": "org_123",
"query": "deployment strategies",
"tags": ["kubernetes", "production"],
"limit": 10
}'
Search with Summaries
curl -X POST https://api.kybernesis.com/retrieval/hybrid \
-H "Content-Type: application/json" \
-H "Authorization: Bearer YOUR_API_KEY" \
-d '{
"orgId": "org_123",
"query": "API authentication methods",
"includeSummaries": true,
"limit": 5
}'
TypeScript SDK Example
import { KybernesisClient } from '@kybernesis/sdk';
const client = new KybernesisClient({
apiKey: process.env.KYBERNESIS_API_KEY,
orgId: 'org_123'
});
// Basic search
const results = await client.search({
query: "machine learning deployment",
limit: 10
});
// Search with filters
const filtered = await client.search({
query: "API design patterns",
tags: ["rest", "graphql"],
limit: 15
});
// Process results
for (const result of results.results) {
console.log(`[${result.hybridScore.toFixed(2)}] ${result.memory.title}`);
console.log(` Tags: ${result.memory.tags.join(', ')}`);
console.log(` Summary: ${result.memory.summary}`);
}
JavaScript Fetch Example
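// Read the API key from the environment rather than hard-coding it
const API_KEY = process.env.KYBERNESIS_API_KEY;
async function searchMemories(query, tags = []) {
  const response = await fetch('https://api.kybernesis.com/retrieval/hybrid', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${API_KEY}`
    },
    body: JSON.stringify({
      orgId: 'org_123',
      query,
      tags,
      limit: 10,
      includeSummaries: true
    })
  });
  // Surface HTTP errors instead of parsing an error body as results
  if (!response.ok) {
    throw new Error(`Search failed with status ${response.status}`);
  }
  const data = await response.json();
  return data.results;
}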
// Usage
const results = await searchMemories(
"kubernetes deployment",
["production", "devops"]
);
Python Example
import os
import requests
# Read the API key from the environment rather than hard-coding it
API_KEY = os.environ["KYBERNESIS_API_KEY"]
def search_memories(query, tags=None, limit=10):
    url = "https://api.kybernesis.com/retrieval/hybrid"
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_KEY}"
    }
    payload = {
        "orgId": "org_123",
        "query": query,
        "tags": tags or [],
        "limit": limit,
        "includeSummaries": True
    }
    # A timeout keeps a stalled connection from hanging the caller
    response = requests.post(url, json=payload, headers=headers, timeout=10)
    response.raise_for_status()
    return response.json()["results"]
# Usage
results = search_memories(
    query="database optimization techniques",
    tags=["postgresql", "performance"],
    limit=15
)
for result in results:
    print(f"[{result['hybridScore']:.2f}] {result['memory']['title']}")
    print(f" Tags: {', '.join(result['memory']['tags'])}")
Advanced Techniques
Query Refinement
Start broad, then narrow:
Query 1: "deployment"
→ Too many results
Query 2: "kubernetes deployment"
→ Better, still broad
Query 3: "kubernetes deployment strategies for microservices"
→ Focused, high-quality results
Multi-Stage Retrieval
Retrieve, analyze, then refine:
// Stage 1: Broad search
const initial = await search({
query: "API design",
limit: 20
});
// Stage 2: Analyze tags from results
const commonTags = extractCommonTags(initial.results);
// → ["rest", "graphql", "authentication"]
// Stage 3: Refined search
const refined = await search({
query: "API design authentication patterns",
tags: commonTags.slice(0, 2),
limit: 10
});
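The `extractCommonTags` helper used in stage 2 is not defined in this guide. One plausible version, sketched here in Python, counts tag frequency across the initial results and returns the most common ones. The name `extract_common_tags`, the `top_n` parameter, and the alphabetical tie-break are assumptions.

```python
from collections import Counter


def extract_common_tags(results, top_n=3):
    """Return the top_n most frequent tags across results (hypothetical helper)."""
    counts = Counter(
        tag
        for result in results
        for tag in result["memory"]["tags"]
    )
    # Sort by count (descending), then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, _ in ranked[:top_n]]


initial_results = [
    {"memory": {"tags": ["rest", "authentication"]}},
    {"memory": {"tags": ["rest", "graphql"]}},
    {"memory": {"tags": ["rest", "graphql", "authentication"]}},
    {"memory": {"tags": ["versioning"]}},
]
print(extract_common_tags(initial_results))
```

Since tag filters use AND logic, passing only the top one or two tags to the refined search avoids over-constraining it.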
Re-Ranking Results
Custom scoring for specific use cases:
const results = await search({ query: "..." });
// sort() reorders the array in place, so copy before each re-ranking
// Re-rank by recency
const reranked = [...results].sort((a, b) =>
new Date(b.memory.createdAtIso).getTime() -
new Date(a.memory.createdAtIso).getTime()
);
// Re-rank by priority
const byPriority = [...results].sort((a, b) =>
b.memory.priority - a.memory.priority
);
Combining Multiple Queries
Search for multiple concepts:
const [backend, frontend, devops] = await Promise.all([
search({ query: "backend optimization", tags: ["api"] }),
search({ query: "frontend performance", tags: ["react"] }),
search({ query: "deployment automation", tags: ["kubernetes"] })
]);
// Merge and deduplicate
const combined = mergeResults([backend, frontend, devops]);
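`mergeResults` is likewise left undefined. A simple approach, sketched below in Python, keeps the best-scoring entry per `memoryId` and sorts the merged list by `hybridScore`. Treat it as one possible policy: scores from different queries are not strictly comparable, so the final ordering is only approximate.

```python
def merge_results(result_lists):
    """Merge several result lists, keeping the best-scoring entry per memoryId."""
    best = {}
    for results in result_lists:
        for result in results:
            current = best.get(result["memoryId"])
            if current is None or result["hybridScore"] > current["hybridScore"]:
                best[result["memoryId"]] = result
    return sorted(best.values(), key=lambda r: r["hybridScore"], reverse=True)


backend = [
    {"memoryId": "mem_1", "hybridScore": 0.81},
    {"memoryId": "mem_2", "hybridScore": 0.64},
]
frontend = [{"memoryId": "mem_3", "hybridScore": 0.77}]
devops = [
    {"memoryId": "mem_1", "hybridScore": 0.90},  # duplicate of mem_1, higher score
    {"memoryId": "mem_4", "hybridScore": 0.52},
]

combined = merge_results([backend, frontend, devops])
print([(r["memoryId"], r["hybridScore"]) for r in combined])
```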
Troubleshooting
Issue: No Results Returned
Possible causes:
- Query too specific
- No memories match filters
- Embeddings not generated
Solutions:
✓ Broaden your query:
❌ "kubernetes helm chart v3 deployment to AWS EKS"
✓ "kubernetes deployment strategies"
✓ Remove tag filters:
// Try without tags first
{ query: "deployment" }
// Then add tags back incrementally
{ query: "deployment", tags: ["kubernetes"] }
✓ Check if memories exist:
curl "https://api.kybernesis.com/api/memories?limit=10" \
-H "Authorization: Bearer YOUR_API_KEY"
Issue: Irrelevant Results
Possible causes:
- Query too vague
- Low-quality memory content
- Missing embeddings
Solutions:
✓ Add more context to query:
❌ "performance"
✓ "database query performance optimization"
✓ Use tag filters:
{
query: "optimization",
tags: ["database", "postgresql"] // Focus results
}
✓ Check result scores:
results.forEach(r => {
if (r.hybridScore < 0.5) {
console.warn("Low relevance:", r.memory.title);
}
});
Issue: Slow Queries
Possible causes:
- Too many results requested
- First query (not cached)
- System load
Solutions:
✓ Reduce limit:
{ query: "...", limit: 10 } // Instead of 50
✓ Use tag filters:
{ query: "...", tags: ["specific-tag"] } // Narrows search space
✓ Monitor latency:
const start = Date.now();
const results = await search({ query: "..." });
console.log(`Latency: ${Date.now() - start}ms`);
Issue: Duplicate Results
Possible causes:
- Same memory with multiple matching chunks
- Similar memories with slight variations
Solutions:
✓ Deduplicate by memoryId:
// Keep the first (highest-ranked) result for each memoryId
const seen = new Set();
const unique = results.filter(r =>
!seen.has(r.memoryId) && seen.add(r.memoryId)
);
✓ Deduplicate by similarity:
const deduped = results.filter((result, index) => {
return !results.slice(0, index).some(prev =>
calculateSimilarity(result, prev) > 0.95
);
});
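`calculateSimilarity` is not defined in this guide. As a stand-in, the sketch below uses Jaccard overlap between the word sets of two summaries. This is a deliberately simple, assumed measure; comparing embeddings would catch paraphrases that word overlap misses. Unlike the snippet above, it compares each result only against results already kept.

```python
def jaccard(text_a, text_b):
    """Token-set overlap between two strings, from 0.0 (disjoint) to 1.0 (same words)."""
    tokens_a = set(text_a.lower().split())
    tokens_b = set(text_b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def dedupe_near_duplicates(results, threshold=0.95):
    """Keep a result only if no previously kept result exceeds the threshold."""
    kept = []
    for result in results:
        text = result["memory"]["summary"]
        if all(jaccard(text, k["memory"]["summary"]) <= threshold for k in kept):
            kept.append(result)
    return kept


results = [
    {"memoryId": "mem_1", "memory": {"summary": "Guide to optimizing SQL queries"}},
    {"memoryId": "mem_2", "memory": {"summary": "guide to optimizing sql queries"}},
    {"memoryId": "mem_3", "memory": {"summary": "Kubernetes deployment checklist"}},
]
deduped = dedupe_near_duplicates(results)
print([r["memoryId"] for r in deduped])
```

Because earlier results win, run this on a list already sorted by relevance so the higher-ranked copy is the one that survives.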
Issue: Missing Expected Results
Possible causes:
- Memory in archive tier
- Memory not yet indexed
- Embedding version mismatch
Solutions:
✓ Check memory status:
curl https://api.kybernesis.com/api/memories/{memoryId}
✓ Verify tier:
// Check if memory is archived
if (memory.tier === "archive") {
// May not appear in search
}
✓ Re-index memory:
curl -X POST https://api.kybernesis.com/api/memories/{memoryId}/reindex
Next Steps
- UI Guide - Learn to search using the topology interface
- Core Concepts - Understand how retrieval fits into the system
- Memory System - Dive deeper into embeddings and chunking