"RAG Core Study (19/26) — Query Routing: Multi-Retriever and Collection Routing"

Even a well-written query fails if it is sent to the wrong document space or the wrong retrieval engine.

Query Routing is the layer that decides where and how retrieval should happen. It can select a document collection, a retriever type, a filter set, or a fallback path. Part 19 explains why routing matters, how it differs from query classification, and why filter-first routing often matters more than model choice.


0. Prerequisites

  • Part 17 query classification
  • Part 18 query rewrite
  • Part 3 ingestion design, especially filter-first retrieval

1. Learning Objectives

  1. Distinguish collection routing from retriever routing.
  2. Understand why routing belongs before retrieval.
  3. Learn the role of fallback paths in production RAG.
  4. See why source authority often matters as much as similarity.

2. Key Summary

Collection routing chooses which document group should be searched. Retriever routing chooses which retrieval method should be used there. In production systems, routing often follows a sequence like: classify -> select collection -> apply filters -> choose retriever -> fallback if needed. Without routing, the system searches a space that is too large with no guidance, which adds latency and lowers precision.


3. Intuition — The Right Answer Can Live in the Wrong Place

Question: “Who approves the external-sharing exception?”

Possible sources:

  • official policy documents
  • training material
  • meeting notes
  • legal commentary

The answer may appear in several places, but those places do not have equal authority. Routing encodes that preference before retrieval starts.


4. Definitions — Core Routing Terms

| Term | Definition |
|---|---|
| Collection Routing | Selecting the document set or index to search |
| Retriever Routing | Selecting Dense, Sparse, Hybrid, or another retrieval mode |
| Filter-first Retrieval | Narrowing the searchable space with filters before similarity search |
| Fallback Route | Secondary search path used when the first route is uncertain or weak |
| Source Authority | Preference ordering over document sources by trust level |
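
To make filter-first retrieval concrete, here is a minimal sketch over an in-memory list. The document shape (`text`, `meta`), the `source` metadata field, and the token-overlap score are all assumptions made for illustration; a real system would push the filter into the vector store or search engine and use an actual similarity score.

```python
def filter_first_search(docs, filters, query, top_k=3):
    """Narrow by metadata first, then rank only the survivors."""
    # Step 1: keep documents whose metadata matches every mandatory filter.
    candidates = [
        d for d in docs
        if all(d["meta"].get(key) == value for key, value in filters.items())
    ]
    # Step 2: rank the candidates. Token overlap stands in for similarity here.
    query_tokens = set(query.lower().split())
    scored = [
        (len(query_tokens & set(d["text"].lower().split())), d)
        for d in candidates
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for score, d in scored[:top_k] if score > 0]


# Made-up sample documents, only for demonstration.
docs = [
    {"id": "p1", "text": "external sharing exception approval process",
     "meta": {"source": "policy"}},
    {"id": "m1", "text": "meeting about external sharing exception approval",
     "meta": {"source": "meeting_notes"}},
    {"id": "p2", "text": "password rotation schedule",
     "meta": {"source": "policy"}},
]

hits = filter_first_search(docs, {"source": "policy"}, "external sharing exception")
print([d["id"] for d in hits])
```

The meeting note matches the query text just as well as the policy document, but it never reaches the ranking step because the filter removed it first.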

5. Mechanism — Routing as a Pre-retrieval Decision Layer

Routing typically decides:

  1. which collection is valid
  2. which filters are mandatory
  3. which retriever signal should dominate
  4. which fallback path exists if confidence is low

This is why routing is not a post-hoc explanation layer. It directly shapes the search space itself.
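
One way to picture the four decisions is as a single record produced before any search runs. The sketch below is an assumption-laden illustration: `RoutePlan`, `plan_route`, and the `status` filter field are hypothetical names, and the collection names are borrowed from the walkthrough in section 6.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple


@dataclass
class RoutePlan:
    collection: str                              # 1. which collection is valid
    filters: dict = field(default_factory=dict)  # 2. which filters are mandatory
    retriever: str = "dense"                     # 3. which signal should dominate
    fallback: Optional[Tuple[str, str]] = None   # 4. (collection, retriever) if confidence is low


def plan_route(query_type: str) -> RoutePlan:
    """Turn a query type into a complete pre-retrieval plan."""
    if query_type == "time_sensitive":
        return RoutePlan(
            collection="latest_policies",
            filters={"status": "active"},  # hypothetical metadata field
            fallback=("general_knowledge", "hybrid"),
        )
    if query_type == "proper_noun":
        return RoutePlan(
            collection="official_docs",
            retriever="sparse",
            fallback=("general_knowledge", "hybrid"),
        )
    # The default route is already the broadest one, so it has no fallback.
    return RoutePlan(collection="general_knowledge")


plan = plan_route("time_sensitive")
print(plan)
```

Because the plan is built before retrieval, it can be logged, tested, and reviewed independently of whichever search backend executes it.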


6. Walkthrough — A Minimal Rule-based Router

6.1 Collection routing

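def choose_collection(query_type, query):
    """Pick the collection to search from the query type and text."""
    # Recency-sensitive questions go to the most current policy set.
    if query_type == "time_sensitive":
        return "latest_policies"
    # Case-insensitive keyword check, so "Meeting" also matches.
    if "meeting" in query.lower():
        return "meeting_notes"
    # Named entities are most reliably answered by official documents.
    if query_type == "proper_noun":
        return "official_docs"
    # Default: the broadest collection.
    return "general_knowledge"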

6.2 Retriever routing

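def choose_retriever(query_type):
    """Pick the retrieval mode from the query type."""
    # Exact names and identifiers favour lexical (sparse) matching.
    if query_type == "proper_noun":
        return "sparse"
    # Comparisons tend to need both exact terms and semantic context.
    if query_type == "comparison":
        return "hybrid"
    # Everything else defaults to semantic (dense) search.
    return "dense"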

6.3 Fallback logic

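# Try the primary route first; 0.45 is an illustrative threshold, not a tuned value.
result = run_primary_route(query)
if result.confidence < 0.45:
    # Low confidence: retry on the broadest collection with the hybrid retriever.
    result = run_fallback_route(query, retriever="hybrid", collection="general_knowledge")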
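
The fallback snippet relies on route functions that are not defined here. A self-contained sketch with stubbed routes might look like the following. `RouteResult`, `retrieve_with_fallback`, and both stub functions are hypothetical, and keeping the more confident of the two attempts (rather than always taking the fallback) is a design choice of this sketch, not something the snippet above prescribes.

```python
from dataclasses import dataclass, field


@dataclass
class RouteResult:
    route: str
    confidence: float
    hits: list = field(default_factory=list)


def retrieve_with_fallback(query, primary, fallback, min_confidence=0.45):
    """Run the primary route; consult the fallback only if confidence is low."""
    first = primary(query)
    if first.confidence >= min_confidence:
        return first
    second = fallback(query)
    # Keep whichever attempt is more confident, so a weak fallback
    # cannot replace a slightly better primary result.
    return second if second.confidence > first.confidence else first


# Stub routes with made-up confidences, standing in for real retrieval calls.
def weak_primary(query):
    return RouteResult("official_docs/sparse", 0.30)


def broad_fallback(query):
    return RouteResult("general_knowledge/hybrid", 0.62, ["doc-17"])


result = retrieve_with_fallback(
    "Who approves the external-sharing exception?", weak_primary, broad_fallback
)
print(result.route)  # general_knowledge/hybrid
```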

Self-explanation: Why is routing more than just a convenience feature in RAG?


7. Variants and Use Cases

7.1 Source-based routing

What changes
The system prefers one source family before another.

Why it matters
Many retrieval failures are really source-selection failures rather than embedding failures.

What it enables
You can prioritise authoritative documents before looser commentary.

Limit and next step
Rigid source rules may hide relevant exceptions, which is why fallback routes still matter.
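
Source-based preference can be sketched as a sort key that ranks by authority tier first and by similarity second. The tier order below is an assumption chosen for illustration (the text only establishes that policy should outrank commentary), and the hit format with `source` and `score` keys is hypothetical.

```python
# Lower number = more authoritative. The exact order is an assumption.
AUTHORITY = {
    "official_policy": 0,
    "legal_commentary": 1,
    "training_material": 2,
    "meeting_notes": 3,
}


def rank_by_authority(hits):
    """Sort hits by source authority first, then by descending similarity."""
    unknown_tier = len(AUTHORITY)  # unlisted sources sort after all known ones
    return sorted(
        hits,
        key=lambda h: (AUTHORITY.get(h["source"], unknown_tier), -h["score"]),
    )


# Made-up hits: the meeting note is more similar, the policy more authoritative.
hits = [
    {"id": "m1", "source": "meeting_notes", "score": 0.91},
    {"id": "p1", "source": "official_policy", "score": 0.74},
    {"id": "c1", "source": "legal_commentary", "score": 0.80},
]

print([h["id"] for h in rank_by_authority(hits)])  # ['p1', 'c1', 'm1']
```

A strict tier sort like this is exactly the kind of rigid rule that can bury a relevant exception, so in practice it would be paired with a fallback route or a minimum-similarity cutoff per tier.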

7.2 Multi-retriever routing

Use Dense for semantic questions, Sparse for identifier-heavy questions, and Hybrid for mixed questions. This is one of the most practical routing patterns in production RAG.
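
A rough way to approximate "identifier-heavy" without a classifier is to count tokens that look like codes or numbers. The sketch below is a heuristic under stated assumptions: the `ID_LIKE` pattern, the 0.5 ratio, and the function name `pick_retriever` are all illustrative choices, not values taken from this series.

```python
import re

# Assumption: a token is identifier-like if it contains a digit (E1042, 4.2,
# POL-7) or is an all-caps code of two or more letters.
ID_LIKE = re.compile(r"\d|^[A-Z]{2,}$")


def pick_retriever(query: str) -> str:
    """Choose dense, sparse, or hybrid from the share of identifier-like tokens."""
    tokens = [t.strip("?.,!:;()") for t in query.split()]
    tokens = [t for t in tokens if t]
    if not tokens:
        return "dense"
    id_tokens = [t for t in tokens if ID_LIKE.search(t)]
    ratio = len(id_tokens) / len(tokens)
    if ratio >= 0.5:
        return "sparse"   # mostly identifiers: exact matching matters most
    if id_tokens:
        return "hybrid"   # some identifiers mixed with natural language
    return "dense"        # purely semantic phrasing


print(pick_retriever("E1042 POL-7 error"))
print(pick_retriever("how does clause 4.2 differ from the old wording"))
print(pick_retriever("why does retrieval quality drop over time"))
```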

7.3 Learned routers

Once enough logs exist, a lightweight classifier or LLM can decide route choice. This adds flexibility, but it also makes route decisions harder to inspect and debug.
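
As a toy illustration of learning routes from logs, the sketch below counts which route each word co-occurred with and lets the words of a new query vote. It is deliberately simplistic and stands in for a real classifier; the log format of `(query, route)` pairs, the function names, and the sample logs are all assumptions.

```python
from collections import Counter, defaultdict


def train_router(logs):
    """Count, for every word, how often it appeared under each logged route."""
    votes = defaultdict(Counter)
    for query, route in logs:
        for word in set(query.lower().split()):
            votes[word][route] += 1
    return votes


def predict_route(votes, query, default=None):
    """Sum the per-word votes; return `default` when no word was seen before."""
    tally = Counter()
    for word in set(query.lower().split()):
        tally.update(votes.get(word, {}))
    if not tally:
        return default  # hand over to the rule-based router
    return tally.most_common(1)[0][0]


# Made-up route logs, only for demonstration.
logs = [
    ("minutes from the budget meeting", "meeting_notes"),
    ("meeting action items", "meeting_notes"),
    ("travel policy approval", "official_docs"),
    ("policy on remote work", "official_docs"),
]

votes = train_router(logs)
print(predict_route(votes, "who owns the policy"))
print(predict_route(votes, "zzz", default="rule_based"))
```

Returning a default for unseen vocabulary keeps the rule-based router in place as the safety net, which limits the opacity cost of the learned component.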


8. Limits and Failure Modes

8.1 Bad routing can ruin good retrievers

Searching the wrong collection means retrieval quality collapses even if the search algorithm itself is strong.

8.2 Routing logic can become too complex

Excessive exceptions and route conditions make the system brittle and hard to debug.

8.3 No fallback means no recovery path

If the first route fails and no secondary route exists, a single routing mistake becomes a full answer failure.

8.4 Next step — Routing still leaves the weighting question open

Once a route is chosen, the next design problem is often how much to trust Dense versus Sparse for that query. That is Part 20.


8.5 Common Pitfalls

| # | Pitfall | Symptom | Fast Check |
|---|---|---|---|
| 1 | one giant search space | low precision | separate collections earlier |
| 2 | no fallback route | brittle failures | define a secondary route |
| 3 | ignoring source authority | commentary outranks policy | rank sources by trust |
| 4 | not logging route choices | weak observability | record route decisions in traces |
| 5 | too many route exceptions | difficult maintenance | simplify the route policy |
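
For pitfall 4, recording route decisions can be as small as one structured log line per query. This is a minimal sketch using Python's standard `logging` and `json` modules; the logger name `rag.router` and the record fields are assumptions, and a production system would more likely attach these fields to its existing tracing spans.

```python
import json
import logging

logger = logging.getLogger("rag.router")  # hypothetical logger name


def log_route_decision(query, query_type, collection, retriever, used_fallback=False):
    """Emit one structured record per routing decision and return it."""
    record = {
        "query": query,
        "query_type": query_type,
        "collection": collection,
        "retriever": retriever,
        "used_fallback": used_fallback,
    }
    # sort_keys keeps the line stable, which makes logs easier to diff and grep.
    logger.info("route_decision %s", json.dumps(record, sort_keys=True))
    return record


logging.basicConfig(level=logging.INFO)
record = log_route_decision(
    "Who approves the external-sharing exception?",
    "proper_noun",
    "official_docs",
    "sparse",
)
```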

9. Self-check — Answer Before Looking

Q1. What does collection routing decide?

Answer: It decides which document group or index should be searched.
Why: The right answer often depends on searching the right source space first.

Q2. What does retriever routing decide?

Answer: It decides whether Dense, Sparse, Hybrid, or another retrieval mode should be used.
Why: Different question types benefit from different retrieval signals.

Q3. Why is fallback important?

Answer: Because routing decisions can be wrong, and the system needs a recovery path.
Why: A single bad route should not automatically become a final answer failure.


Cheat Sheet — One-page Summary

Definitions

  • Collection Routing: choose the searchable source space
  • Retriever Routing: choose the retrieval method
  • Fallback Route: backup retrieval path

Minimal code

collection = choose_collection(query_type, query)
retriever = choose_retriever(query_type)

When to use what

| Situation | Route default |
|---|---|
| proper noun lookup | official docs + sparse |
| semantic definition | dense |
| mixed evidence | hybrid |
| recency-sensitive question | latest collection + filters |


References

Supporting notes

  • User notes, section 35-6 filter-first retrieval
  • User notes, chapter 16 query routing

Bridge to the Next Part

Routing decides the path, but a Hybrid route still needs one more choice: how much Dense and how much Sparse should count? Part 20 covers dynamic weighting.
