
Knowledge Quickstart

Get your R2R-powered knowledge base running in 5 minutes.

# Sign up at https://app.sciphi.ai
# Copy your API key from the dashboard
# Add to app/.env.local
R2R_API_URL=https://api.sciphi.ai
R2R_API_KEY=sk-your-api-key-here
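A small startup check can catch a missing or malformed setting before the first request is made. The sketch below only assumes the two variables shown above. The helper name `readR2RConfig` is hypothetical, and the key is treated as optional because the local setup in this guide sets only `R2R_API_URL`.

```typescript
// Hypothetical helper, not part of Supen or R2R: validates the settings
// that this guide places in app/.env.local.
interface R2RConfig {
  apiUrl: string;
  apiKey: string | undefined;
}

function readR2RConfig(env: Record<string, string | undefined>): R2RConfig {
  const apiUrl = (env['R2R_API_URL'] ?? '').trim();
  const apiKey = (env['R2R_API_KEY'] ?? '').trim();

  if (!/^https?:\/\/\S+$/.test(apiUrl)) {
    throw new Error('R2R_API_URL must be an http(s) URL');
  }
  // The placeholder value from the quickstart counts as "not configured".
  if (apiKey === 'sk-your-api-key-here') {
    throw new Error('R2R_API_KEY is still the placeholder value');
  }
  return {
    apiUrl: apiUrl.replace(/\/+$/, ''), // drop trailing slashes
    apiKey: apiKey === '' ? undefined : apiKey,
  };
}
```

In a Node.js process this could be called once at startup as `readR2RConfig(process.env)`.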
# Start Supen
pnpm dev
# Test in another terminal
curl http://localhost:3333/api/knowledge/health
# Should return: {"available": true, ...}
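The same health check can be done from application code. The following is a minimal sketch that relies only on the `available` field shown in the expected response. The function name `isKnowledgeAvailable` and the injected `fetchFn` parameter are illustrative assumptions, chosen so the check can be exercised without a running server.

```typescript
// Minimal fetch-like type; the global fetch satisfies it.
type FetchLike = (url: string) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

// Hypothetical helper: resolves true only if the health endpoint
// answers with a JSON body whose "available" field is true.
async function isKnowledgeAvailable(fetchFn: FetchLike, baseUrl: string): Promise<boolean> {
  try {
    const res = await fetchFn(`${baseUrl}/api/knowledge/health`);
    if (!res.ok) return false;
    const body = await res.json();
    return typeof body === 'object'
      && body !== null
      && (body as { available?: unknown }).available === true;
  } catch {
    return false; // network failure or a body that is not valid JSON
  }
}
```

With the development server from this guide running, `await isKnowledgeAvailable(fetch, 'http://localhost:3333')` would perform the same request as the curl command.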
# Upload a PDF with complex layout support
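# ./document.pdf is a placeholder path; the "file" field matches the upload form used later in this guide
curl -X POST http://localhost:3333/api/knowledge/documents \
  -F "file=@./document.pdf" \
  -F "useHiResMode=true" \
  -F "workspaceId=my-workspace"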
# RAG query (search + generate answer)
curl -X POST http://localhost:3333/api/knowledge/query \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What are the key findings?",
    "workspaceId": "my-workspace",
    "limit": 5
  }'

Done! 🎉

# Start R2R locally
docker compose up r2r r2r-postgres -d
# Configure Supen to use local R2R
# app/.env.local
R2R_API_URL=http://localhost:7272

For a Railway deployment, see r2r-railway-deployment.md.

// Create a collection
const response = await fetch('/api/knowledge/collections', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    name: 'Product Documentation',
    description: 'All product docs',
    workspaceId: 'my-workspace'
  })
});
const { collection } = await response.json();
console.log('Collection ID:', collection.id);

// Upload a document into the collection
const formData = new FormData();
formData.append('file', fileInput.files[0]);
formData.append('collectionId', collection.id);
formData.append('workspaceId', 'my-workspace');
formData.append('useHiResMode', 'true'); // For complex PDFs
const response = await fetch('/api/knowledge/documents', {
  method: 'POST',
  body: formData
});
const { document } = await response.json();
console.log('Document ID:', document.id);

// Search the knowledge base
const response = await fetch('/api/knowledge/search', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    query: 'pricing information',
    workspaceId: 'my-workspace',
    limit: 10,
    hybridSearch: true // Semantic + keyword
  })
});
const { results } = await response.json();
results.forEach(r => {
  console.log(`Score: ${r.score}`);
  console.log(`Text: ${r.text}`);
});
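Because each search result carries a `score` and a `text`, callers can post-filter the list before showing it or passing it to a model. The sketch below drops weak and duplicate hits. It assumes that a higher score means a better match, which this guide does not state, and the name `selectResults` and its thresholds are hypothetical.

```typescript
// Shape taken from the fields logged in the search example.
interface SearchResult {
  score: number;
  text: string;
}

// Hypothetical helper: keeps the best-scoring unique passages.
// Assumption: higher score = more relevant.
function selectResults(
  results: SearchResult[],
  minScore: number,
  maxResults: number
): SearchResult[] {
  const seen = new Set<string>();
  return [...results]
    .sort((a, b) => b.score - a.score) // best first, without mutating the input
    .filter(r => {
      if (r.score < minScore) return false;
      const key = r.text.trim();
      if (seen.has(key)) return false; // skip verbatim duplicates
      seen.add(key);
      return true;
    })
    .slice(0, maxResults);
}
```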

// RAG query with a streamed answer
const response = await fetch('/api/knowledge/query', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    query: 'Summarize the pricing model',
    workspaceId: 'my-workspace',
    model: 'gpt-4o',
    temperature: 0.7,
    stream: true
  })
});
// Reuse one decoder so multi-byte characters split across chunks decode correctly
const reader = response.body.getReader();
const decoder = new TextDecoder();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  console.log(decoder.decode(value, { stream: true })); // Print each chunk as it arrives
}
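A streamed body arrives as arbitrary byte chunks, so a multi-byte UTF-8 character can be split across two reads. The sketch below shows, on hand-made chunks, how a single `TextDecoder` used with `{ stream: true }` carries the incomplete bytes over to the next call. The helper name `decodeChunks` is illustrative only.

```typescript
// Decodes a sequence of byte chunks as one UTF-8 text stream.
function decodeChunks(chunks: Uint8Array[]): string {
  const decoder = new TextDecoder(); // UTF-8 by default
  let text = '';
  for (const chunk of chunks) {
    // stream: true keeps an incomplete trailing character buffered
    text += decoder.decode(chunk, { stream: true });
  }
  return text + decoder.decode(); // flush anything still buffered
}

// "é" is the two bytes 0xC3 0xA9 in UTF-8; here they land in different chunks.
const parts = [new Uint8Array([0x63, 0x61, 0x66, 0xc3]), new Uint8Array([0xa9])];
console.log(decodeChunks(parts)); // café
```

Decoding each chunk with a fresh `TextDecoder` instead would turn the split character into replacement characters.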

// Use the knowledge base inside an agent
import { getKnowledgeService } from '@/lib/r2r';

// In your agent code (userMessage and workspaceId come from the surrounding code)
const knowledgeService = getKnowledgeService();

// Retrieve relevant context
const searchResults = await knowledgeService.search(
  userMessage,
  { workspaceId, limit: 5 }
);

// Add to system prompt
const contextPrompt = searchResults
  .map(r => `Source: ${r.text}`)
  .join('\n');
const systemMessage = `
You are a helpful AI assistant with access to the following knowledge:
${contextPrompt}
Use this information to answer the user's question accurately and cite sources.
`;
// Continue with the Vibex agent using systemMessage
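Since the system prompt asks the model to cite sources, it can help to number the retrieved passages and to cap how much text is injected. The following is a sketch under those assumptions. The name `buildContext` and the character budget are hypothetical, and only the `text` field from the search results is used.

```typescript
interface Snippet {
  text: string;
}

// Hypothetical helper: formats passages as "[1] ...", "[2] ..." and stops
// before the combined length would exceed maxChars.
function buildContext(snippets: Snippet[], maxChars: number): string {
  const lines: string[] = [];
  let used = 0;
  let index = 0;
  for (const snippet of snippets) {
    index += 1;
    const line = `[${index}] ${snippet.text.trim()}`;
    if (used + line.length > maxChars) break; // budget exhausted
    lines.push(line);
    used += line.length + 1; // +1 for the joining newline
  }
  return lines.join('\n');
}
```

The returned string could take the place of `contextPrompt` when the system message is assembled, which lets the model refer to passages by number.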

// Document upload options
{
  useHiResMode: true,            // Use for complex PDFs
  workspaceId: 'workspace-123',  // Filter by workspace
  collectionId: 'col-456',       // Organize in collection
  metadata: {                    // Custom metadata
    author: 'John Doe',
    category: 'technical'
  }
}
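The upload options can be given a type and turned into form fields in one place. This is a sketch only: the guide does not say how `metadata` is transmitted, so sending it as a single JSON-encoded field is an assumption, as are the name `toFormFields` and treating `workspaceId` as required.

```typescript
// Option names taken from the list above; which ones are required is assumed.
interface UploadOptions {
  workspaceId: string;
  collectionId?: string;
  useHiResMode?: boolean;
  metadata?: Record<string, string>;
}

// Hypothetical helper: converts options into [name, value] form fields.
function toFormFields(options: UploadOptions): Array<[string, string]> {
  const fields: Array<[string, string]> = [['workspaceId', options.workspaceId]];
  if (options.collectionId !== undefined) {
    fields.push(['collectionId', options.collectionId]);
  }
  if (options.useHiResMode !== undefined) {
    fields.push(['useHiResMode', String(options.useHiResMode)]);
  }
  // Assumption: metadata travels as one JSON-encoded form field.
  if (options.metadata !== undefined) {
    fields.push(['metadata', JSON.stringify(options.metadata)]);
  }
  return fields;
}
```

Each pair can then be appended to a `FormData` instance next to the `file` field, for example with `formData.append(name, value)`.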

// Search and RAG query options
{
  workspaceId: 'workspace-123',  // Filter by workspace
  limit: 10,                     // Number of results
  hybridSearch: true,            // Semantic + keyword
  model: 'gpt-4o',               // LLM for generation
  temperature: 0.7,              // Generation randomness
  stream: true                   // Stream response
}
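A small builder can keep request bodies consistent and stop oversized `limit` values from reaching the API. The sketch below uses the option names listed above. The default of 5 mirrors the quickstart query, while the upper bound of 50 and the name `buildQueryBody` are hypothetical choices rather than documented limits.

```typescript
interface QueryOptions {
  workspaceId: string;
  limit?: number;
  hybridSearch?: boolean;
  model?: string;
  temperature?: number;
  stream?: boolean;
}

// Hypothetical helper: returns the JSON body for a search or query request.
function buildQueryBody(query: string, options: QueryOptions, maxLimit = 50): string {
  const trimmed = query.trim();
  if (trimmed === '') {
    throw new Error('query must not be empty');
  }
  const requested = options.limit ?? 5;
  // Clamp to [1, maxLimit]; fall back to 5 for NaN or Infinity.
  const limit = Number.isFinite(requested)
    ? Math.min(Math.max(1, Math.floor(requested)), maxLimit)
    : 5;
  return JSON.stringify({ ...options, query: trimmed, limit });
}
```

The result can be passed directly as the `body` of the `fetch` calls shown earlier.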

Complex PDF Support

  • Tables, multi-column layouts
  • Embedded images
  • Scanned documents (OCR)

Hybrid Search

  • Semantic (meaning-based)
  • Keyword (exact match)
  • Fusion of both
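To make "fusion" concrete: one widely used way to combine a semantic ranking with a keyword ranking is reciprocal rank fusion (RRF), where each list contributes 1 / (k + rank) to a document's score. This guide does not say which fusion method or constant the backend uses, so the sketch below is only an illustration of the idea, with the conventional k = 60 as an assumed default.

```typescript
// Illustrative reciprocal rank fusion; not taken from the R2R or Supen code.
function reciprocalRankFusion(
  rankings: string[][],
  k = 60
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return Array.from(scores, ([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

const semantic = ['a', 'b', 'c']; // hypothetical document ids, best first
const keyword = ['c', 'a', 'd'];
console.log(reciprocalRankFusion([semantic, keyword]).map(r => r.id)); // [ 'a', 'c', 'b', 'd' ]
```

Documents that appear near the top of both lists ("a" and "c" here) rise above documents found by only one method.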

Workspace Isolation

  • Filter documents by workspace
  • Multi-tenant support

Streaming Responses

  • Real-time answer generation
  • Better UX for long queries

Metadata Filtering

  • Custom metadata per document
  • Filter search by metadata

  1. Use hi-res mode selectively - Only for complex PDFs
  2. Set appropriate limits - Don’t retrieve more than needed
  3. Enable hybrid search - Better accuracy for most queries
  4. Cache common queries - Reduce API calls
  5. Use workspace filters - Faster searches, better relevance
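Caching common queries, as suggested in the list above, can be as simple as an in-memory map with an expiry time. The sketch below is one possible shape, not the project's own cache: the class `TtlCache` and the helper `cacheKey` are hypothetical, and the clock is injectable so the expiry logic can be tested. The key includes the workspace so answers never leak between workspaces.

```typescript
// Hypothetical in-memory cache with a fixed time-to-live per entry.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: evict and report a miss
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Scope cache keys by workspace so tenants never share cached answers.
function cacheKey(workspaceId: string, query: string): string {
  return `${workspaceId}::${query.trim()}`;
}
```

A caller would check `cache.get(cacheKey(workspaceId, query))` before hitting `/api/knowledge/query` and store the answer with `set` afterwards. Streamed responses would need to be collected into a string first.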

Document processing stuck?

  • Check status: GET /api/knowledge/documents/:id
  • Hi-res mode takes 30-60s per document
  • Check SciPhi dashboard for processing queue
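Rather than checking the status by hand, a client can poll with increasing delays until processing finishes or a timeout is reached. The sketch below is deliberately generic because this guide does not show the shape of the status response: the caller supplies a `probe` function that performs `GET /api/knowledge/documents/:id` and decides whether the document is ready. The name `pollUntilDone` and all delay values are hypothetical defaults, chosen with the 30-60s hi-res processing time in mind.

```typescript
// Resolves true once processing has finished, false while it is still pending.
type Probe = () => Promise<boolean>;

// Hypothetical helper: polls with exponential backoff until the probe
// succeeds or roughly timeoutMs of waiting has accumulated.
async function pollUntilDone(
  probe: Probe,
  sleep: (ms: number) => Promise<void>,
  initialDelayMs = 2000,
  maxDelayMs = 15000,
  timeoutMs = 120000
): Promise<boolean> {
  let waited = 0;
  let delay = initialDelayMs;
  while (true) {
    if (await probe()) return true;
    if (waited >= timeoutMs) return false; // give up; caller can report "stuck"
    await sleep(delay);
    waited += delay;
    delay = Math.min(delay * 2, maxDelayMs); // back off, up to the cap
  }
}
```

In application code, `sleep` can be `ms => new Promise(resolve => setTimeout(resolve, ms))`. Injecting it keeps the helper testable without real waiting.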

Poor search results?

  • Try hybrid search (semantic + keyword)
  • Increase limit to get more candidates
  • Check if documents are fully ingested
  • Verify workspaceId filter isn’t too restrictive

API errors?

  • Check /api/knowledge/health endpoint
  • Verify R2R_API_KEY is set correctly
  • Check SciPhi dashboard for quota limits

Next Steps

  • Build UI for document management (see knowledge/page.tsx)
  • Integrate with Vibex agents for automatic knowledge retrieval
  • Set up automated document ingestion pipelines
  • Add knowledge base selection in chat interface