RAG Applications
Retrieval-Augmented Generation (RAG) combines document retrieval with AI generation to provide accurate, context-aware responses.
Overview
Section titled “Overview”RAG enhances LLM responses by:
- Retrieving relevant documents from a knowledge base
- Augmenting the prompt with retrieved context
- Generating responses grounded in the retrieved information
SolidRusT AI Data Layer
Section titled “SolidRusT AI Data Layer”The SolidRusT AI platform includes a built-in data layer with multiple query types:
| Endpoint | Description | Best For |
|---|---|---|
POST /data/v1/query/semantic | Vector similarity search | Finding conceptually similar content |
POST /data/v1/query/keyword | Full-text keyword search | Exact term matching |
POST /data/v1/query/hybrid | Combined semantic + knowledge graph | Complex queries needing both approaches |
POST /data/v1/query/knowledge-graph | Entity relationship traversal | Exploring connections between concepts |
Semantic Search
Section titled “Semantic Search”Find documents based on meaning, not just keywords.
curl -X POST "https://api.solidrust.ai/data/v1/query/semantic" \ -H "Authorization: Bearer YOUR_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "query": "How do I implement authentication?", "limit": 5, "min_score": 0.5 }'Python
Section titled “Python”import requests
API_KEY = "YOUR_API_KEY"BASE_URL = "https://api.solidrust.ai"
def semantic_search(query: str, limit: int = 10, min_score: float = 0.5) -> dict: """Search the knowledge base using semantic similarity.""" response = requests.post( f"{BASE_URL}/data/v1/query/semantic", headers={ "Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json" }, json={ "query": query, "limit": limit, "min_score": min_score } ) return response.json()
# Search for relevant documentsresults = semantic_search("How do I implement authentication?")
print(f"Found {results['total']} results in {results['latency_ms']}ms")for result in results["results"]: print(f"\nScore: {result['score']:.2f}") print(f"Source: {result['metadata'].get('source', 'unknown')}") print(f"Content: {result['content'][:200]}...")JavaScript
Section titled “JavaScript”const API_KEY = 'YOUR_API_KEY';const BASE_URL = 'https://api.solidrust.ai';
async function semanticSearch(query, limit = 10, minScore = 0.5) { const response = await fetch(`${BASE_URL}/data/v1/query/semantic`, { method: 'POST', headers: { Authorization: `Bearer ${API_KEY}`, 'Content-Type': 'application/json', }, body: JSON.stringify({ query, limit, min_score: minScore, }), }); return response.json();}
// Usageconst results = await semanticSearch('How do I implement authentication?');console.log(`Found ${results.total} results in ${results.latency_ms}ms`);
results.results.forEach((result) => { console.log(`\nScore: ${result.score.toFixed(2)}`); console.log(`Source: ${result.metadata?.source || 'unknown'}`); console.log(`Content: ${result.content.substring(0, 200)}...`);});Filtering by Source
Section titled “Filtering by Source”Narrow results to specific document sources:
results = requests.post( f"{BASE_URL}/data/v1/query/semantic", headers={"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"}, json={ "query": "API rate limits", "sources": ["api-docs", "guides"], # Only search these sources "limit": 10, "min_score": 0.6 }).json()Keyword Search
Section titled “Keyword Search”Traditional full-text search with typo tolerance.
curl -X POST "https://api.solidrust.ai/data/v1/query/keyword" \ -H "Authorization: Bearer YOUR_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "query": "rate limit 429", "limit": 10, "sort": "relevance" }'Python
Section titled “Python”def keyword_search(query: str, limit: int = 10, sort: str = "relevance") -> dict: """Full-text keyword search with typo tolerance.""" response = requests.post( f"{BASE_URL}/data/v1/query/keyword", headers={ "Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json" }, json={ "query": query, "limit": limit, "sort": sort # "relevance" or "date" } ) return response.json()
# Search for exact termsresults = keyword_search("rate limit 429 error")
for result in results["results"]: print(f"Title: {result['metadata'].get('title', 'Untitled')}") print(f"Content: {result['content'][:200]}...")Hybrid Search
Section titled “Hybrid Search”Combines semantic similarity with knowledge graph relationships for comprehensive results.
curl -X POST "https://api.solidrust.ai/data/v1/query/hybrid" \ -H "Authorization: Bearer YOUR_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "query": "Python SDK authentication", "semantic_weight": 0.7, "graph_weight": 0.3, "limit": 10 }'Python
Section titled “Python”def hybrid_search( query: str, semantic_weight: float = 0.7, graph_weight: float = 0.3, entity_boost: list = None, limit: int = 10) -> dict: """Combined semantic and knowledge graph search.""" payload = { "query": query, "semantic_weight": semantic_weight, "graph_weight": graph_weight, "limit": limit }
if entity_boost: payload["entity_boost"] = entity_boost
response = requests.post( f"{BASE_URL}/data/v1/query/hybrid", headers={ "Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json" }, json=payload ) return response.json()
# Balance semantic and graph resultsresults = hybrid_search( query="How to use Python SDK for embeddings", semantic_weight=0.7, # 70% semantic similarity graph_weight=0.3, # 30% knowledge graph entity_boost=["Python", "embeddings"] # Boost these entities)
for result in results["results"]: print(f"Score: {result['score']:.2f} - {result['content'][:100]}...")Knowledge Graph Queries
Section titled “Knowledge Graph Queries”Explore entity relationships in the knowledge base.
curl -X POST "https://api.solidrust.ai/data/v1/query/knowledge-graph" \ -H "Authorization: Bearer YOUR_API_KEY" \ -H "Content-Type: application/json" \ -d '{ "entity": "Python", "relationship_types": ["uses", "implements", "related_to"], "direction": "both", "depth": 2, "limit": 50 }'Python
Section titled “Python”def knowledge_graph_query( entity: str, relationship_types: list = None, direction: str = "both", depth: int = 1, limit: int = 50) -> dict: """Query entity relationships in the knowledge graph.""" payload = { "entity": entity, "direction": direction, # "outgoing", "incoming", or "both" "depth": depth, # How many hops to traverse (1-5) "limit": limit }
if relationship_types: payload["relationship_types"] = relationship_types
response = requests.post( f"{BASE_URL}/data/v1/query/knowledge-graph", headers={ "Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json" }, json=payload ) return response.json()
# Find all entities related to "Authentication"result = knowledge_graph_query( entity="Authentication", direction="both", depth=2)
print(f"Found {len(result['entities'])} entities")print(f"Found {len(result['relationships'])} relationships")
# Print entitiesprint("\nEntities:")for entity in result["entities"]: print(f" - {entity['name']} ({entity['type']})")
# Print relationshipsprint("\nRelationships:")for rel in result["relationships"]: print(f" {rel['source']} --[{rel['type']}]--> {rel['target']}")JavaScript
Section titled “JavaScript”async function knowledgeGraphQuery(entity, options = {}) { const payload = { entity, direction: options.direction || 'both', depth: options.depth || 1, limit: options.limit || 50, };
if (options.relationshipTypes) { payload.relationship_types = options.relationshipTypes; }
const response = await fetch(`${BASE_URL}/data/v1/query/knowledge-graph`, { method: 'POST', headers: { Authorization: `Bearer ${API_KEY}`, 'Content-Type': 'application/json', }, body: JSON.stringify(payload), }); return response.json();}
// Find related conceptsconst result = await knowledgeGraphQuery('API', { relationshipTypes: ['uses', 'related_to'], depth: 2,});
console.log('Related entities:');result.entities.forEach((e) => console.log(` - ${e.name} (${e.type})`));Complete RAG Pattern
Section titled “Complete RAG Pattern”Combine retrieval with generation:
from openai import OpenAIimport requests
# Initialize clientsllm = OpenAI( api_key="YOUR_API_KEY", base_url="https://api.solidrust.ai/v1")
API_KEY = "YOUR_API_KEY"BASE_URL = "https://api.solidrust.ai"
def rag_answer(question: str) -> str: """Answer a question using RAG."""
# 1. Search for relevant context using hybrid search search_results = requests.post( f"{BASE_URL}/data/v1/query/hybrid", headers={ "Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json" }, json={ "query": question, "semantic_weight": 0.7, "graph_weight": 0.3, "limit": 5 } ).json()
# 2. Build context from retrieved documents context_parts = [] for result in search_results.get("results", []): source = result.get("metadata", {}).get("source", "unknown") content = result["content"] context_parts.append(f"[Source: {source}]\n{content}")
context = "\n\n---\n\n".join(context_parts)
# 3. Generate answer with context response = llm.chat.completions.create( model="vllm-primary", messages=[ { "role": "system", "content": f"""Answer questions based on the following context. Cite sources when possible.
{context}
If the answer is not in the context, say "I don't have that information in my knowledge base.""" }, {"role": "user", "content": question} ], temperature=0.3 )
return response.choices[0].message.content
# Usageanswer = rag_answer("How do I authenticate API requests?")print(answer)Choosing the Right Search Type
Section titled “Choosing the Right Search Type”| Search Type | Use When |
|---|---|
| Semantic | Questions about concepts, “how to” queries |
| Keyword | Searching for specific terms, error codes, exact phrases |
| Hybrid | Complex queries that benefit from both approaches |
| Knowledge Graph | Exploring relationships, finding connected concepts |
Best Practices
Section titled “Best Practices”Chunking
Section titled “Chunking”- Keep chunks between 256-512 tokens
- Use overlap for better context continuity
- Consider semantic chunking for natural boundaries
Retrieval
Section titled “Retrieval”- Experiment with the number of retrieved documents (3-10)
- Use metadata filtering when available
- Consider hybrid search for better recall
Prompting
Section titled “Prompting”- Clearly separate context from instructions
- Instruct the model to cite sources
- Handle cases where context doesn’t contain the answer
Performance
Section titled “Performance”- Set appropriate
min_scorethresholds to filter low-quality matches - Use
limitto control result set size - Cache frequent queries when appropriate
Related
Section titled “Related”- Agent Chat Example - Let the AI handle RAG automatically
- Semantic Search Example - Detailed semantic search patterns
- Embeddings API - Create embeddings for your documents
- Document Q&A Example - Complete RAG implementation with vector stores