Memory Management: RAG and Context Windows
Configure retrieval-augmented generation and context windowing so your OpenClaw agent remembers what matters without wasting tokens.
What You Will Get
By the end of this guide, your OpenClaw agent will use retrieval-augmented generation (RAG) to pull relevant information from a knowledge base and context window strategies to manage conversation history efficiently. Your agent will answer questions accurately by referencing stored documents while staying within token limits.
RAG addresses a fundamental limitation of language models: the context window is finite. Instead of cramming everything into the prompt, you store knowledge in a searchable vector database and retrieve only the relevant pieces when needed. This keeps costs low and answers precise.
You will set up a knowledge base, configure embedding and retrieval, tune the context window size, and test the full pipeline. The result is an agent that draws on a large body of knowledge without ever exceeding its token budget.
Step-by-Step Setup
Follow these steps to configure RAG and context management.
Upload Documents to the Knowledge Base
Open your agent's Knowledge Base tab in the RunTheAgent dashboard. Upload the documents your agent should reference, such as product manuals, FAQs, or internal guides. Supported formats include PDF, Markdown, plain text, and HTML. The system chunks and indexes each document automatically.
Configure Embedding Settings
Choose the embedding model used to convert documents into vectors. The default model works well for most use cases. Adjust the chunk size and overlap settings based on your content. Shorter chunks improve retrieval precision, while longer chunks preserve more context per result.
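The dashboard chunks documents for you, but it helps to see what chunk size and overlap actually control. The sketch below is illustrative only; the sizes and the character-based splitting are assumptions, not RunTheAgent's actual defaults, which may split on tokens or sentence boundaries.

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into overlapping chunks of roughly chunk_size characters.

    The overlap repeats the tail of one chunk at the head of the next, so
    a sentence that straddles a boundary still appears whole in one chunk.
    """
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks

# A 1200-character document yields three chunks; each consecutive pair
# shares a 50-character overlap region.
doc = "".join(chr(65 + i % 26) for i in range(1200))
pieces = chunk_text(doc, chunk_size=500, overlap=50)
```

Shrinking `chunk_size` makes each retrieved result more precise but more fragmentary; growing `overlap` costs storage but reduces the chance of cutting a key sentence in half.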
Set Retrieval Parameters
Configure how many chunks the agent retrieves per query. Start with three to five chunks and adjust based on answer quality. Set a minimum similarity threshold to filter out irrelevant results. Chunks below the threshold are excluded from the agent's context.
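Top-k retrieval with a similarity floor can be sketched in a few lines. The vectors and the 0.25 threshold below are made-up illustrations; a real deployment would use embedding vectors from your chosen model and a threshold tuned on your own queries.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve(query_vec, index, top_k=3, threshold=0.25):
    """Rank chunks by similarity, keep the top k, drop anything below the floor."""
    scored = [(cosine(query_vec, vec), text) for text, vec in index]
    scored.sort(reverse=True)
    return [(score, text) for score, text in scored[:top_k] if score >= threshold]

# Toy index: (chunk text, embedding vector) pairs.
index = [
    ("refund policy", [0.9, 0.1, 0.0]),
    ("shipping times", [0.1, 0.9, 0.0]),
    ("office dog photos", [0.0, 0.0, 1.0]),
]
hits = retrieve([0.8, 0.2, 0.0], index, top_k=2, threshold=0.25)
```

The threshold matters as much as k: without it, an off-topic query still returns k chunks, and the agent is handed irrelevant context to reason over.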
Configure the Context Window Strategy
Decide how conversation history is managed alongside retrieved knowledge. You can use a sliding window that keeps the last N messages, a summary mode that compresses older messages, or a hybrid approach. Each strategy trades context richness against token efficiency.
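A hybrid of the two strategies looks roughly like this: keep the last N messages verbatim and collapse everything older into a single summary message. The summary here is a placeholder string; in summary mode the platform would generate a real compressed summary, typically by calling the model itself.

```python
def windowed_history(messages, keep_last=4):
    """Keep the last N messages verbatim; collapse the rest into one stub.

    A real summary mode would replace the stub content with a model-written
    summary of the older turns; this sketch only shows the windowing shape.
    """
    if len(messages) <= keep_last:
        return messages
    older = messages[:-keep_last]
    summary = {
        "role": "system",
        "content": f"[Summary of {len(older)} earlier messages]",
    }
    return [summary] + messages[-keep_last:]

history = [{"role": "user", "content": f"msg {i}"} for i in range(10)]
trimmed = windowed_history(history, keep_last=4)
```

Ten messages become five: one summary stub plus the four most recent turns, so recent detail is preserved while the token cost of old history stays constant.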
Write RAG-Aware System Prompts
Update your agent's system prompt to instruct it on how to use retrieved context. For example, 'Use the provided reference documents to answer questions. If the documents do not contain relevant information, say so rather than guessing.' This prevents the agent from hallucinating when the knowledge base has gaps.
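Assembling the retrieved chunks into that system prompt can be sketched as simple string templating. The template wording follows the example above; the function name and the bullet formatting are illustrative choices, not a RunTheAgent API.

```python
SYSTEM_TEMPLATE = """Use the provided reference documents to answer questions.
If the documents do not contain relevant information, say so rather than guessing.

Reference documents:
{references}"""

def build_system_prompt(chunks):
    """Render retrieved chunks into the system prompt as a bulleted list."""
    refs = "\n".join(f"- {chunk}" for chunk in chunks) or "- (none retrieved)"
    return SYSTEM_TEMPLATE.format(references=refs)

prompt = build_system_prompt(["Refunds are processed within 5 business days."])
```

Note the explicit empty-result case: when retrieval returns nothing, the prompt still tells the model that no references were found, which reinforces the "say so rather than guessing" instruction instead of leaving a silent gap.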
Test Retrieval Accuracy
Ask your agent questions that require information from the knowledge base. Check the logs to see which chunks were retrieved and whether they were relevant. If the wrong chunks are returned, adjust your chunk size, overlap, or similarity threshold.
Monitor Token Usage
Track token consumption on the analytics panel. Compare usage before and after enabling RAG to quantify the savings. If usage is still too high, reduce the number of retrieved chunks or switch to a more aggressive context compression strategy.
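To build intuition for the savings, compare a prompt that stuffs every document into context against one that injects a handful of retrieved chunks. The 4-characters-per-token rule of thumb below is a rough heuristic for English text, and the sizes are invented for illustration; the analytics panel reports exact counts from the real tokenizer.

```python
def approx_tokens(text):
    """Rough heuristic: ~4 characters per token for English text."""
    return max(1, len(text) // 4)

# Invented sizes for illustration only.
full_context = "x" * 40000  # every document pasted into the prompt
rag_context = "x" * 2000    # four retrieved chunks of ~500 characters

saved = approx_tokens(full_context) - approx_tokens(rag_context)
```

At these assumed sizes the heuristic estimates roughly 9,500 tokens saved per request, which compounds quickly across a busy agent.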
Tips and Best Practices
Keep Documents Up to Date
Stale documents lead to outdated answers. Set a reminder to review and update your knowledge base at least monthly. Delete obsolete documents to prevent the agent from surfacing old information.
Use Metadata Filters
Tag documents with metadata like category, date, or department. Then configure retrieval to filter by metadata before ranking by similarity. This narrows the search space and improves accuracy.
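Filter-then-rank can be sketched as a two-stage query: discard chunks whose metadata does not match, then rank only the survivors by similarity. The dict-based index, the `where` clause shape, and the plain dot-product scoring are illustrative assumptions, not the platform's actual query interface.

```python
def dot(a, b):
    """Dot-product similarity (stand-in for a real similarity metric)."""
    return sum(x * y for x, y in zip(a, b))

def filtered_retrieve(query_vec, index, where, top_k=3):
    """Keep chunks whose metadata matches every key in `where`, then rank."""
    candidates = [
        entry for entry in index
        if all(entry["meta"].get(k) == v for k, v in where.items())
    ]
    candidates.sort(key=lambda entry: dot(query_vec, entry["vec"]), reverse=True)
    return candidates[:top_k]

index = [
    {"text": "2023 refund FAQ", "meta": {"dept": "support", "year": 2023}, "vec": [0.9, 0.1]},
    {"text": "2025 refund FAQ", "meta": {"dept": "support", "year": 2025}, "vec": [0.8, 0.2]},
    {"text": "engineering handbook", "meta": {"dept": "eng", "year": 2025}, "vec": [0.1, 0.9]},
]
hits = filtered_retrieve([1.0, 0.0], index, where={"dept": "support", "year": 2025}, top_k=3)
```

Because the metadata filter runs first, the stale 2023 FAQ never competes on similarity at all, even though its vector scores higher for this query.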
Test Edge Cases
Ask questions that span multiple documents or that the knowledge base cannot answer. Verify that the agent handles these gracefully by combining chunks from different sources or admitting it does not have the information.
Balance Chunk Size and Retrieval Count
Smaller chunks with more retrievals give fine-grained answers. Larger chunks with fewer retrievals preserve broader context. Experiment with both approaches to find the sweet spot for your content type.
Ready to get started?
Deploy your own OpenClaw instance in under 60 seconds. No VPS, no Docker, no SSH. Just your personal AI assistant, ready to work.
Starting at $24.50/mo. Everything included. 3-day money-back guarantee.