"RAG Core Study (25/26) — Security, Permissions, and Re-indexing Operations"
In production RAG, a spectacular answer is still a failure if it came from a document the user should never have seen.
Most RAG study paths spend much more time on embeddings than on permissions. Production systems cannot afford that imbalance. Security, source control, versioning, and re-indexing are not side concerns. They are structural retrieval concerns. Part 25 explains why permissions must be enforced before retrieval, not after it, and why stale indexes are operationally dangerous.
0. Prerequisites
- Part 7 metadata design
- Part 19 query routing
- Parts 23 and 24, where retrieval becomes more structurally complex
1. Learning Objectives
- Explain why security must be embedded into retrieval rather than added afterward.
- Distinguish RBAC, ABAC, and row-level filtering.
- Understand why re-indexing and versioning matter operationally.
- Recognise the role of audit logs in production RAG.
2. Core Summary
The core operational rule is filter-first security. The retriever should search only within the set of documents the user is allowed to access. RBAC handles role-based access, ABAC handles attribute-based access, and row-level retrieval filters can enforce document- or chunk-level permissions. At the same time, stale indexes can quietly produce outdated or revoked information, so re-indexing and version tracking are part of the same operational discipline.
3. Intuition — Why Post-filter Security Is Too Late
If an unauthorised document has already entered the candidate set, then:
- it may appear in reranker inputs
- it may be stored in traces
- its contents may influence answer generation
That is why permission checks should not be an afterthought. They should shape the searchable space from the start.
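A minimal sketch of the difference, using a toy in-memory corpus. The `Doc` type, the `allowed_roles` field, and both search functions are hypothetical stand-ins for a real vector store, and ranking is reduced to keyword overlap. Both functions return the same final hits, but only the post-filter variant lets an unauthorised document ID reach the trace.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    doc_id: str
    text: str
    allowed_roles: frozenset


CORPUS = [
    Doc("hr-1", "salary bands for engineering", frozenset({"hr"})),
    Doc("eng-1", "engineering onboarding guide", frozenset({"hr", "engineer"})),
]


def score(query: str, doc: Doc) -> int:
    # Toy relevance: number of shared words (stand-in for vector similarity).
    return len(set(query.split()) & set(doc.text.split()))


def post_filter_search(query, role, trace):
    # Candidates are ranked BEFORE the permission check.
    candidates = sorted(CORPUS, key=lambda d: score(query, d), reverse=True)
    trace.extend(d.doc_id for d in candidates)  # unauthorised IDs reach the trace
    return [d for d in candidates if role in d.allowed_roles]


def pre_filter_search(query, role, trace):
    # The searchable space is restricted first, then ranked.
    visible = [d for d in CORPUS if role in d.allowed_roles]
    candidates = sorted(visible, key=lambda d: score(query, d), reverse=True)
    trace.extend(d.doc_id for d in candidates)
    return candidates


post_trace, pre_trace = [], []
post_hits = post_filter_search("engineering salary", "engineer", post_trace)
pre_hits = pre_filter_search("engineering salary", "engineer", pre_trace)
```

In a real pipeline the trace would also include reranker inputs and prompt context, which is where the leak becomes harder to undo.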
4. Definitions — Core Operational Terms
| Term | Definition |
|---|---|
| RBAC | role-based access control |
| ABAC | attribute-based access control |
| Row-level Security | applying access control at document or chunk level |
| Re-indexing | rebuilding or updating embeddings and indices after source changes |
| Versioning | tracking which document or index version was used |
| Audit Log | trace of who searched what and what evidence was returned |
5. Mechanism — Security Belongs Inside the Retrieval Pipeline
The usual order should be:
- identify the user and permission scope
- restrict allowed collections and namespaces
- apply metadata filters
- run retrieval and reranking
- log the event for auditability
Security is therefore a retrieval design decision, not only an application-layer feature.
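The ordering above can be sketched as one function. Everything here is an assumption made for illustration: `User`, `InMemoryStore`, and `secure_retrieve` are hypothetical names, similarity is replaced by keyword overlap, and the reranking step is omitted to keep the example short.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    user_id: str
    namespaces: set
    allowed_levels: set


@dataclass
class InMemoryStore:
    # Hypothetical stand-in for a vector store: {namespace: [chunk dicts]}.
    data: dict = field(default_factory=dict)

    def search(self, query, namespaces, levels, k):
        # Only chunks inside the permitted scope are ever scored.
        pool = [
            chunk
            for ns in sorted(namespaces)
            for chunk in self.data.get(ns, [])
            if chunk["security_level"] in levels
        ]
        words = set(query.lower().split())
        pool.sort(key=lambda c: len(words & set(c["text"].lower().split())), reverse=True)
        return pool[:k]


def secure_retrieve(user, query, store, audit_log, k=5):
    # Steps 1-2: the user's identity determines namespaces and levels.
    # Steps 3-4: the filter is applied inside the search call itself.
    hits = store.search(query, user.namespaces, user.allowed_levels, k)
    # Step 5: record what was returned, and to whom.
    audit_log.append(
        {"user_id": user.user_id, "query": query, "doc_ids": [h["id"] for h in hits]}
    )
    return hits


store = InMemoryStore(
    {
        "public": [{"id": "p1", "text": "vacation policy", "security_level": "low"}],
        "finance": [{"id": "f1", "text": "vacation budget forecast", "security_level": "high"}],
    }
)
audit_log = []
alice = User("alice", {"public"}, {"low"})
hits = secure_retrieve(alice, "vacation policy", store, audit_log)
```

The finance chunk is never scored for this user, so it cannot appear in the hits or in the audit record.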
6. Walkthrough — Security Filters and Re-index Control
6.1 Permission-aware filters
    # Build the permission scope first; the search only sees matching chunks.
    filters = {
        "security_level": {"$in": user.allowed_levels},
        "department": {"$in": user.allowed_departments},
        "version_status": "active",
    }
    hits = vector_store.search(query, filter=filters, k=10)
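To make the filter semantics concrete, here is a small sketch of how such a filter could be evaluated against chunk metadata. The `matches` helper is hypothetical and supports only equality and `$in`; real vector stores differ in filter syntax and operators. The one design choice worth noting is that a chunk with a missing field is rejected rather than allowed.

```python
def matches(metadata: dict, filters: dict) -> bool:
    """Return True only if every filter condition holds; missing fields fail closed."""
    for field_name, condition in filters.items():
        if field_name not in metadata:
            return False  # untagged chunks are never searchable
        value = metadata[field_name]
        if isinstance(condition, dict):
            if set(condition) != {"$in"}:
                raise ValueError(f"unsupported operator in {condition!r}")
            if value not in condition["$in"]:
                return False
        elif value != condition:
            return False
    return True


example_filters = {
    "security_level": {"$in": ["public", "internal"]},
    "department": {"$in": ["sales"]},
    "version_status": "active",
}
chunk_ok = {"security_level": "internal", "department": "sales", "version_status": "active"}
chunk_untagged = {"department": "sales", "version_status": "active"}
chunk_archived = {**chunk_ok, "version_status": "archived"}
```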
6.2 Re-index trigger
    # Re-index any document modified after the last index build.
    if document.updated_at > index.last_built_at:
        enqueue_reindex(document.id)
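A runnable version of the same check, assuming timestamps are timezone-aware and the queue is a simple in-process `deque`. The `Document` type and `find_stale` helper are illustrative names, not part of any particular library.

```python
from collections import deque
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Document:
    id: str
    updated_at: datetime


def find_stale(documents, last_built_at):
    """Return IDs of documents modified after the index was last built."""
    return [d.id for d in documents if d.updated_at > last_built_at]


reindex_queue = deque()
last_built_at = datetime(2025, 1, 10, tzinfo=timezone.utc)
docs = [
    Document("policy-a", datetime(2025, 1, 5, tzinfo=timezone.utc)),
    Document("policy-b", datetime(2025, 1, 12, tzinfo=timezone.utc)),
]
reindex_queue.extend(find_stale(docs, last_built_at))
```

Keeping every timestamp in UTC avoids a common trap: Python raises `TypeError` when a naive `datetime` is compared with an aware one.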
6.3 Audit logging
    # Record who asked what, which evidence came back, and from which index version.
    log_event({
        "user_id": user.id,
        "query": query,
        "retrieved_doc_ids": [doc.id for doc in hits],
        "index_version": current_index_version,
    })
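One way to implement `log_event` is as an append-only stream of JSON lines. This is a sketch under assumptions: the signature below takes explicit arguments rather than a dict, the timestamp field `ts` is an addition, and `io.StringIO` stands in for a real log file or log service.

```python
import io
import json
from datetime import datetime, timezone


def log_event(stream, user_id, query, doc_ids, index_version):
    """Append one audit record as a single JSON line and return it."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,
        "query": query,
        "retrieved_doc_ids": list(doc_ids),
        "index_version": index_version,
    }
    stream.write(json.dumps(record) + "\n")
    return record


buffer = io.StringIO()  # stand-in for an append-only log file
log_event(buffer, "u-17", "refund policy", ["doc-3", "doc-9"], "idx-v12")
parsed = json.loads(buffer.getvalue().splitlines()[0])
```

One record per line keeps the log easy to append to and easy to scan during an investigation.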
Self-explanation: Why is pre-retrieval filtering safer than post-retrieval filtering?
7. Variants and Use Cases
7.1 RBAC-driven retrieval
What changes
Collections or filters depend mainly on the user’s role.
Why it matters
It is easy to reason about when organisational boundaries are stable.
What it enables
Fast and understandable permission-aware retrieval.
Limit and next step
Complex organisations often need finer-grained attribute logic.
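A minimal RBAC sketch: roles map to collections, and the user's searchable space is the union of what each role grants. The role and collection names are invented for illustration.

```python
ROLE_COLLECTIONS = {
    "employee": {"handbook"},
    "hr": {"handbook", "hr_cases"},
    "finance": {"handbook", "ledgers"},
}


def collections_for(roles):
    """Union of collections granted by each role; unknown roles grant nothing."""
    allowed = set()
    for role in roles:
        allowed |= ROLE_COLLECTIONS.get(role, set())
    return allowed


hr_scope = collections_for(["hr"])
mixed_scope = collections_for(["hr", "finance"])
unknown_scope = collections_for(["contractor"])
```

An unrecognised role yields an empty scope, so a misconfigured account retrieves nothing instead of everything.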
7.2 ABAC-driven retrieval
Attributes such as region, project code, contract status, or classification level may all influence access.
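An ABAC decision can be sketched as a predicate over user and document attributes. The attribute names, the three-level classification ladder, and the `abac_allows` function are all assumptions chosen to mirror the examples in the sentence above.

```python
LEVELS = ["public", "internal", "confidential"]  # ordered low to high


def abac_allows(user: dict, doc: dict) -> bool:
    """Grant access only if region, project, clearance, and contract all line up."""
    same_region = doc["region"] in user["regions"]
    on_project = doc["project_code"] is None or doc["project_code"] in user["projects"]
    cleared = LEVELS.index(user["clearance"]) >= LEVELS.index(doc["classification"])
    return same_region and on_project and cleared and user["contract_active"]


analyst = {
    "regions": {"EU"},
    "projects": {"P-104"},
    "clearance": "internal",
    "contract_active": True,
}
eu_doc = {"region": "EU", "project_code": "P-104", "classification": "internal"}
us_doc = {**eu_doc, "region": "US"}
secret_doc = {**eu_doc, "classification": "confidential"}
```

In a filter-first design, rules like these would usually be translated into metadata filters so they constrain the search itself, not applied to results afterwards.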
7.3 Version-aware retrieval
Retrieval may default to active documents while still allowing explicit historical lookup when authorised and requested.
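A small sketch of that default-plus-override behaviour. The `version_filter` helper and the `doc_version` field are hypothetical; `version_status` is the field used in the walkthrough filter.

```python
def version_filter(as_of_version=None, can_view_history=False):
    """Default to active documents; allow a pinned version only when authorised."""
    if as_of_version is None:
        return {"version_status": "active"}
    if not can_view_history:
        raise PermissionError("historical lookup requires explicit authorisation")
    return {"doc_version": as_of_version}


default_filter = version_filter()
historical_filter = version_filter("2023-Q4", can_view_history=True)
```

Raising an error on an unauthorised historical request, instead of silently falling back to active documents, makes the refusal visible to the caller.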
8. Limits and Failure Modes
8.1 Permission tags can be wrong
If metadata is missing or inaccurate, the retrieval filter inherits the same mistake.
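One mitigation is to validate permission metadata at ingestion time and refuse to index chunks that fail. The required fields below match the walkthrough filter; the set of known levels and the `validate_chunk` helper are assumptions.

```python
REQUIRED_FIELDS = {"security_level", "department", "version_status"}
KNOWN_LEVELS = {"public", "internal", "confidential"}


def validate_chunk(metadata: dict) -> list:
    """Return a list of problems; an empty list means the chunk may be indexed."""
    problems = [
        f"missing field: {name}" for name in sorted(REQUIRED_FIELDS - set(metadata))
    ]
    level = metadata.get("security_level")
    if level is not None and level not in KNOWN_LEVELS:
        problems.append(f"unknown security_level: {level}")
    return problems


good = {"security_level": "internal", "department": "sales", "version_status": "active"}
untagged = {"department": "sales", "version_status": "active"}
mistyped = {**good, "security_level": "intrenal"}
```

Validation catches missing or malformed tags, but it cannot detect a tag that is well-formed and simply wrong, so periodic review of the source permissions is still needed.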
8.2 Delayed re-indexing creates stale answers
Updated source documents do not help if the index still reflects older content.
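Re-index lag can be tracked as a metric. The sketch below uses one possible definition, the gap between the latest source update and the last index build, with an assumed one-hour budget; both the definition and the threshold are illustrative.

```python
from datetime import datetime, timedelta, timezone


def reindex_lag(source_updated_at, index_built_at):
    """How far the index is behind its source; zero if the index is newer."""
    return max(source_updated_at - index_built_at, timedelta(0))


def is_stale(source_updated_at, index_built_at, budget=timedelta(hours=1)):
    """True when the lag exceeds the allowed budget."""
    return reindex_lag(source_updated_at, index_built_at) > budget


built = datetime(2025, 3, 1, 12, 0, tzinfo=timezone.utc)
updated = datetime(2025, 3, 1, 15, 30, tzinfo=timezone.utc)
lag = reindex_lag(updated, built)
```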
8.3 No audit trail means poor incident recovery
Without query and evidence logs, quality failures and security incidents become difficult to investigate.
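With audit records in place, a typical incident question ("who was served this document, and from which index version?") becomes a short query. The record shape follows the audit logging example in section 6.3; `users_exposed_to` and the sample data are hypothetical.

```python
def users_exposed_to(records, doc_id):
    """Return sorted (user_id, index_version) pairs for events that returned doc_id."""
    return sorted(
        {
            (r["user_id"], r["index_version"])
            for r in records
            if doc_id in r["retrieved_doc_ids"]
        }
    )


audit_records = [
    {"user_id": "u-1", "retrieved_doc_ids": ["doc-3", "doc-9"], "index_version": "idx-v11"},
    {"user_id": "u-2", "retrieved_doc_ids": ["doc-4"], "index_version": "idx-v12"},
    {"user_id": "u-3", "retrieved_doc_ids": ["doc-9"], "index_version": "idx-v12"},
]
exposed = users_exposed_to(audit_records, "doc-9")
```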
8.4 Next step — The final step is to integrate all of these ideas into one project
The series ends not with another isolated concept, but with a small capstone that forces the ideas to work together. That is Part 26.
8.5 Common Pitfalls
| # | Pitfall | Symptom | Fast Check |
|---|---|---|---|
| 1 | post-filter security | unauthorised candidates enter the pipeline | enforce filter-first policy |
| 2 | stale index management | outdated answers | monitor re-index lag |
| 3 | weak metadata quality | broken permission filters | validate metadata pipelines |
| 4 | no audit log | poor incident analysis | log user, query, and evidence IDs |
| 5 | no active/archive distinction | old policies outrank new ones | track status and version fields |
9. Self-check — Answer Before Looking
Q1. What is the first rule of secure RAG retrieval?
Answer: Retrieve only from documents the user is authorised to access.
Why: Once unauthorised documents enter the pipeline, damage may already be done.
Q2. Why does re-indexing matter for correctness?
Answer: Because stale embeddings and stale indices can keep retrieving outdated evidence.
Why: Source updates do not automatically propagate into retrieval quality.
Q3. Why are audit logs important?
Answer: They let the team trace what was retrieved, for whom, and under which index version.
Why: That is critical for both quality debugging and security investigation.
Cheat Sheet — One-page Summary
Definitions
- RBAC: role-based access control
- ABAC: attribute-based access control
- Re-indexing: updating retrieval structures after source change
Minimal code
    hits = vector_store.search(query, filter=security_filters, k=10)
When to use what
| Situation | Control style |
|---|---|
| simple organisation | RBAC |
| complex entitlement logic | ABAC |
| recency-sensitive retrieval | version-aware filtering |
Bridge to the Next Part
The last step is to assemble the series into one small, coherent project. Part 26 is the personal-documents RAG capstone and roadmap.