Concepts
Architecture
Semantic Router: Technical Architecture
System Overview
Semantic Router is built around three core components that work together to enable intelligent routing of inputs based on semantic meaning.
Core Components
1. Encoders
Encoders transform inputs into vector representations in semantic space.
Types of Encoders:
- Dense encoders: Generate continuous vectors (OpenAI, HuggingFace, etc.)
- Sparse encoders: Generate sparse vectors (BM25, TFIDF, AurelioSparse, etc.)
- Multimodal encoders: Handle images and text (CLIP, ViT)
2. Routes
Routes define patterns to match against, with examples of inputs that should trigger them.
Key properties:
- name: Identifier for the route
- utterances: Example inputs that should match this route
- function_schemas: Optional specifications for function calling
- score_threshold: Minimum similarity score required to match
3. Indexing Systems
Indexes store and retrieve route vectors efficiently.
Index types:
- LocalIndex: In-memory vector storage for dense embeddings
- HybridLocalIndex: In-memory storage supporting both dense and sparse vectors
- PineconeIndex/QdrantIndex: Cloud-based vector DBs
- PostgresIndex: SQL-based vector storage
Data Flow
- Input Reception: The system receives an input (text, image)
- Encoding: The input is transformed into a vector representation
- Retrieval: The vector is compared against stored route vectors
- Matching: The best matching route is selected based on similarity
- Response: The system returns the matched route, enabling appropriate handling
Router Types
- SemanticRouter: Uses dense vector embeddings for semantic matching
- HybridRouter: Combines both dense and sparse vectors for enhanced accuracy
Integration Example
Performance Considerations
- In-memory vs. Vector DB: Choose based on scale and latency requirements
- Encoder selection: Balance accuracy vs. speed based on use case
- Batch processing: Use batch methods for higher throughput
- Async support: Available for high-concurrency environments and applications relying on heavy network use