"RAG Core Study (17/26) — Query Classification: Typing the User Question"
“What is X?” and “What is the latest approved policy?” both look like questions, but they do not want the same retrieval strategy.
One major source of instability in RAG systems is the assumption that all user questions should flow through the same retrieval path. In practice, definition questions, procedural questions, comparison questions, proper-noun questions, and time-sensitive questions each favour different signals. Query Classification is the step that turns those differences into control logic.
0. Prerequisites
- Part 16 experiment automation
- Part 12 Hybrid Search
- Part 7 metadata filters
1. Learning Objectives
- Explain why query classification matters in RAG.
- Recognise the most useful high-level query types.
- Connect query type to retrieval decisions.
- Understand why this naturally leads to rewrite and routing.
2. Key Summary
Query Classification is the process of identifying the type of question before retrieval. Definition questions often benefit from stronger semantic retrieval. Proper-noun questions benefit from exact-match signals. Time-sensitive questions depend heavily on version filters. Comparison and procedure questions often need broader context and more than one supporting document. Classification is therefore not a reporting label. It is a retrieval control signal.
3. Intuition — Similar Surface, Different Retrieval Need
Consider these questions:
- “What is RAGAS?” -> definition
- “What is the exception approval procedure?” -> procedure
- “What is the difference between Qdrant and Weaviate?” -> comparison
- “What does PR-2024-Q3 conclude?” -> proper noun
- “What is the latest approved policy?” -> time-sensitive
If you send all five through one fixed strategy, some will predictably underperform.
4. Definitions — Common Query Types
| Type | Typical trait | Retrieval priority |
|---|---|---|
| Definition | concept explanation | semantic retrieval |
| Procedure | ordered steps | structured, longer context |
| Comparison | contrast between items | multiple relevant documents |
| Proper noun | code, report ID, person, product name | exact match / sparse |
| Time-sensitive | latest, current, valid as of now | metadata and version filters |
| Permission-sensitive | answer depends on role or access scope | security-aware filters |
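The taxonomy above can be pinned down in code so that labels are spelled one way everywhere. The following is a minimal sketch, not a prescribed schema: the `QueryType` enum and the `retrieval_priority` helper are hypothetical names, and the priority strings are copied from the table.

```python
from enum import Enum


class QueryType(str, Enum):
    """Hypothetical label set mirroring the table of common query types."""
    DEFINITION = "definition"
    PROCEDURE = "procedure"
    COMPARISON = "comparison"
    PROPER_NOUN = "proper_noun"
    TIME_SENSITIVE = "time_sensitive"
    PERMISSION_SENSITIVE = "permission_sensitive"


# Retrieval priority per type, taken from the table.
RETRIEVAL_PRIORITY = {
    QueryType.DEFINITION: "semantic retrieval",
    QueryType.PROCEDURE: "structured, longer context",
    QueryType.COMPARISON: "multiple relevant documents",
    QueryType.PROPER_NOUN: "exact match / sparse",
    QueryType.TIME_SENSITIVE: "metadata and version filters",
    QueryType.PERMISSION_SENSITIVE: "security-aware filters",
}


def retrieval_priority(label: str) -> str:
    """Look up the priority for a label string.

    QueryType(label) raises ValueError for labels outside the taxonomy,
    which surfaces typos early instead of silently using a default.
    """
    return RETRIEVAL_PRIORITY[QueryType(label)]
```

Keeping the label set in one enum also makes it harder for the taxonomy to grow unnoticed.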
5. Mechanism — What the Classification Result Actually Controls
A query-type label can influence:
- whether query rewrite is needed
- Dense vs BM25 weighting
- metadata filter selection
- collection or retriever routing
- top-K depth
This is why classification belongs before retrieval rather than after it.
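One way to make those control points explicit is a small plan object that the retriever consumes. This is a sketch under stated assumptions: `RetrievalPlan`, `plan_for`, the `"default"` collection name, the default values, and the settings chosen for comparison and procedure queries are all illustrative, not values from a specific system.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class RetrievalPlan:
    """Hypothetical bundle of the knobs a query-type label can influence."""
    rewrite: bool = False          # run query rewrite before retrieval?
    alpha: float = 0.5             # dense weight; (1 - alpha) goes to BM25
    filters: Dict[str, str] = field(default_factory=dict)  # metadata filters
    collection: str = "default"    # assumed collection / retriever name
    top_k: int = 5                 # retrieval depth


def plan_for(query_type: str) -> RetrievalPlan:
    """Map a label to a plan. All numeric values are illustrative."""
    if query_type == "proper_noun":
        return RetrievalPlan(alpha=0.2)                      # lean on exact match
    if query_type == "definition":
        return RetrievalPlan(alpha=0.7)                      # lean on semantics
    if query_type == "time_sensitive":
        return RetrievalPlan(filters={"version": "latest"})  # filter first
    if query_type in ("comparison", "procedure"):
        return RetrievalPlan(rewrite=True, top_k=10)         # broader context
    return RetrievalPlan()                                   # unknown label


print(plan_for("time_sensitive"))
```

Because every field has a default, an unknown label degrades to the baseline strategy rather than failing.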
6. Walkthrough — Starting With a Simple Rule-based Classifier
6.1 Minimal classifier
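def classify_query(query: str) -> str:
    """Assign one coarse query type using ordered keyword rules."""
    q = query.lower()  # keyword checks are case-insensitive
    if "latest" in q or "current" in q:
        return "time_sensitive"
    if "difference" in q or "compare" in q:
        return "comparison"
    # Identifier prefixes keep their original casing; plain words do not.
    if any(t in query for t in ["PR-", "SKU-"]) or any(t in q for t in ["report", "policy code"]):
        return "proper_noun"
    if "procedure" in q or "how do i" in q:
        return "procedure"
    return "definition"  # fallback when no rule fires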
6.2 Connecting type to retrieval strategy
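query_type = classify_query(query)
alpha = 0.5   # assumed default: balanced dense/sparse weighting
filters = {}  # metadata filters passed to the retriever
if query_type == "proper_noun":
    alpha = 0.2  # favour sparse
elif query_type == "definition":
    alpha = 0.7  # favour dense
elif query_type == "time_sensitive":
    filters["version"] = "latest"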
Even a basic rule-based classifier can create meaningful gains if it routes obvious cases correctly.
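To see what `alpha` does in practice, the sketch below assumes one common convention: `alpha` is the weight on the dense score and `1 - alpha` the weight on the sparse (BM25) score, with both scores already normalised to the same range. The `fuse_scores` function and the toy scores are hypothetical; real hybrid retrievers may fuse results differently.

```python
from typing import Dict, List, Tuple


def fuse_scores(
    dense: Dict[str, float],
    sparse: Dict[str, float],
    alpha: float,
) -> List[Tuple[str, float]]:
    """Weighted sum of dense and sparse scores, highest first.

    Assumes both score sets are normalised to [0, 1]; a document missing
    from one retriever contributes 0.0 for that side.
    """
    doc_ids = set(dense) | set(sparse)
    fused = {
        doc_id: alpha * dense.get(doc_id, 0.0)
        + (1.0 - alpha) * sparse.get(doc_id, 0.0)
        for doc_id in doc_ids
    }
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


# Toy scores: doc_a is semantically close, doc_b contains the exact ID.
dense = {"doc_a": 0.9, "doc_b": 0.4}
sparse = {"doc_a": 0.1, "doc_b": 1.0}

print(fuse_scores(dense, sparse, alpha=0.2)[0][0])  # doc_b (sparse-leaning)
print(fuse_scores(dense, sparse, alpha=0.7)[0][0])  # doc_a (dense-leaning)
```

The same two candidate documents swap rank purely because of the weight, which is why routing a proper-noun query to the dense-leaning setting can cost exact-match recall.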
Self-explanation: Why should query classification happen before retrieval instead of after?
7. Variants and Use Cases
7.1 Rule-based classification
What changes
A small set of explicit patterns assigns a query type.
Why it matters
It is fast, interpretable, and easy to deploy.
What it enables
You can start adjusting weighting, filters, and routing without training a new model.
Limit and next step
It breaks under varied phrasing, which is why teams later move to learned classifiers or LLM-based classifiers.
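The brittleness is easy to demonstrate. The sketch below repeats the keyword rules from the minimal classifier in 6.1 and probes them with paraphrases; the probe queries and expected labels are made up for illustration.

```python
def classify_query(query: str) -> str:
    """Keyword rules equivalent to the minimal classifier in 6.1."""
    if "latest" in query or "current" in query:
        return "time_sensitive"
    if "difference" in query or "compare" in query:
        return "comparison"
    if any(token in query for token in ["PR-", "report", "policy code", "SKU-"]):
        return "proper_noun"
    if "procedure" in query or "how do I" in query:
        return "procedure"
    return "definition"


# Hypothetical paraphrases that avoid the trigger keywords.
PARAPHRASES = [
    ("What is the newest approved policy?", "time_sensitive"),
    ("Qdrant vs Weaviate: which should we pick?", "comparison"),
    ("Steps to get an exception approved", "procedure"),
]

misses = [
    (query, expected, classify_query(query))
    for query, expected in PARAPHRASES
    if classify_query(query) != expected
]

for query, expected, got in misses:
    print(f"{query!r}: expected {expected}, got {got}")
```

All three probes fall through to the `definition` fallback, because none of them contains a trigger keyword. A probe list like this can double as a regression set when rules are added.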
7.2 LLM-based classification
An LLM can map a query onto a predefined label set and handles varied phrasing better than keyword rules, but it adds latency, and its outputs must be constrained and validated to stay consistent.
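A minimal sketch of the consistency side, assuming the model is reachable through a plain callable that takes a prompt string and returns text. The `llm` parameter, `build_prompt`, `classify_with_llm`, and the prompt wording are all hypothetical; no particular provider API is implied. The point is that the raw output is normalised and checked against the label set, with a fallback when it does not match.

```python
from typing import Callable

LABELS = ("definition", "procedure", "comparison", "proper_noun", "time_sensitive")


def build_prompt(query: str) -> str:
    """Constrain the model to the predefined label set."""
    return (
        "Classify the user question into exactly one label.\n"
        f"Labels: {', '.join(LABELS)}\n"
        f"Question: {query}\n"
        "Answer with the label only."
    )


def classify_with_llm(
    query: str,
    llm: Callable[[str], str],
    fallback: str = "definition",
) -> str:
    """Call the model, normalise its answer, and validate it."""
    raw = llm(build_prompt(query))
    label = raw.strip().lower().replace("-", "_").replace(" ", "_")
    # Anything outside the label set is treated as a failed classification.
    return label if label in LABELS else fallback


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call, used only to exercise the code."""
    return " Time-Sensitive \n"


print(classify_with_llm("What is the latest approved policy?", fake_llm))
# time_sensitive
```

Validating against a closed label set keeps downstream routing deterministic even when the model's phrasing varies.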
7.3 Multi-label classification
Real questions are often mixed: a request such as “compare the latest policy with the previous one” is both time-sensitive and comparative. Multi-label setups can represent that better than single-label systems.
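A multi-label variant can return a set of labels and let each label contribute its own setting. This is a sketch with assumed keyword lists and assumed settings (`top_k` values, the `version` filter); `classify_multi` and `merge_settings` are hypothetical names.

```python
from typing import Dict, Set

# Assumed keyword rules; each label is checked independently.
RULES = {
    "time_sensitive": ("latest", "current"),
    "comparison": ("difference", "compare", "comparison"),
    "procedure": ("procedure", "how do i"),
}


def classify_multi(query: str) -> Set[str]:
    """Return every label whose keywords appear, or {'definition'} if none do."""
    q = query.lower()
    labels = {
        label
        for label, keywords in RULES.items()
        if any(keyword in q for keyword in keywords)
    }
    return labels or {"definition"}


def merge_settings(labels: Set[str]) -> Dict[str, object]:
    """Let each label adjust the part of the strategy it cares about."""
    filters: Dict[str, str] = {}
    top_k = 5  # assumed default depth
    if "time_sensitive" in labels:
        filters["version"] = "latest"
    if "comparison" in labels:
        top_k = 10  # assumed: comparisons need more supporting documents
    return {"filters": filters, "top_k": top_k}


labels = classify_multi("latest policy comparison")
print(sorted(labels))          # ['comparison', 'time_sensitive']
print(merge_settings(labels))  # {'filters': {'version': 'latest'}, 'top_k': 10}
```

Here the two labels touch different knobs, so they compose cleanly; labels that pull the same knob in opposite directions would need an explicit precedence rule.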
8. Limits and Failure Modes
8.1 Too many labels become hard to maintain
If the type taxonomy grows without discipline, the pipeline becomes hard to reason about.
8.2 Misclassification can damage downstream retrieval
If a proper-noun query is treated like a pure semantic definition query, the system may lose exact-match recall.
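One possible mitigation, offered as an assumption rather than a recommendation from the original notes, is a cheap guard that caps the dense weight whenever the query contains an identifier-like token, regardless of the assigned label. The pattern and the `guard_alpha` helper are hypothetical and would need tuning to the ID formats in a given corpus.

```python
import re

# Assumed ID shape: two or more capitals, a hyphen, then digits (e.g. PR-2024).
ID_PATTERN = re.compile(r"\b[A-Z]{2,}-\d+")


def guard_alpha(query: str, alpha: float, max_alpha_for_ids: float = 0.3) -> float:
    """Cap the dense weight when the query carries an identifier-like token."""
    if ID_PATTERN.search(query):
        return min(alpha, max_alpha_for_ids)
    return alpha


# A proper-noun query misrouted to the dense-leaning setting is pulled back.
print(guard_alpha("What does PR-2024-Q3 conclude?", alpha=0.7))  # 0.3
print(guard_alpha("What is RAGAS?", alpha=0.7))                  # 0.7
```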
8.3 Query types drift with domain and product changes
A taxonomy that worked for internal policy search may not work well for product docs or legal retrieval.
8.4 Next step — Once the type is known, the query itself can be rewritten
Classification makes the next move clearer: now the system can rewrite the query differently depending on what kind of retrieval problem it is solving. That is Part 18.
8.5 Common Pitfalls
| # | Pitfall | Symptom | Fast Check |
|---|---|---|---|
| 1 | one strategy for all queries | unstable type-specific performance | evaluate by query type |
| 2 | too many labels | unmaintainable policy logic | begin with a compact taxonomy |
| 3 | not logging query type | poor debugging | store query_type in traces |
| 4 | vague label definitions | inconsistent annotation | define examples per label |
| 5 | classification not connected to retrieval | little impact | wire labels into routing and weights |
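The fast checks for pitfalls 1 and 3 combine naturally: if every trace stores `query_type`, retrieval quality can be broken down per type. The sketch below assumes each trace has been reduced to a `(query_type, hit)` pair, where `hit` means a relevant document appeared in the results; the `hit_rate_by_type` helper and the sample records are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def hit_rate_by_type(records: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Fraction of queries with a relevant hit, grouped by query type."""
    totals: Dict[str, int] = defaultdict(int)
    hits: Dict[str, int] = defaultdict(int)
    for query_type, hit in records:
        totals[query_type] += 1
        hits[query_type] += int(hit)
    return {query_type: hits[query_type] / totals[query_type] for query_type in totals}


# Made-up trace records for illustration.
records = [
    ("definition", True),
    ("definition", True),
    ("proper_noun", False),
    ("proper_noun", True),
]

print(hit_rate_by_type(records))  # {'definition': 1.0, 'proper_noun': 0.5}
```

An aggregate hit rate of 0.75 would hide the fact that proper-noun queries are the weak spot.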
9. Self-check — Answer Before Looking
Q1. What is the purpose of query classification in RAG?
Answer: To let the system adapt retrieval strategy to the kind of question being asked.
Why: Different question types need different retrieval signals.
Q2. Why do proper-noun questions often need sparse retrieval?
Answer: Because exact identifiers are often better captured by exact-token matching.
Why: Dense retrieval may blur rare IDs and codes.
Q3. Why are time-sensitive questions special?
Answer: Because the correct answer may depend more on version and recency than on semantic similarity.
Why: A semantically relevant but outdated policy can still be the wrong answer.
Q4. Why is even a simple rule-based classifier often useful?
Answer: Because many high-value query types are obvious enough to catch with simple patterns.
Why: Routing, filters, and weighting can improve from those distinctions immediately.
Cheat Sheet — One-page Summary
Definitions
- Definition query: concept explanation
- Proper-noun query: code or entity lookup
- Time-sensitive query: recency-dependent question
Minimal code
if "latest" in query:
    query_type = "time_sensitive"
When to use what
| Query type | Retrieval default |
|---|---|
| definition | more dense |
| proper noun | more sparse |
| comparison | broader retrieval |
| time-sensitive | version filters first |
Bridge to the Next Part
Once the system knows what kind of question it is facing, the next step is often to rewrite that question into a retrieval-friendly form. Part 18 covers query rewrite and expansion.