RAG in Practice: Building an Enterprise Knowledge Base with LangChain + Vector Database
The biggest problem with LLMs is hallucination: they will confidently fabricate answers. RAG (Retrieval-Augmented Generation) mitigates this by having the model first retrieve relevant documents and then answer based on them.
What is RAG?
RAG stands for Retrieval-Augmented Generation. The core idea:
- Split enterprise documents into small chunks
- Use an embedding model to convert each chunk into a vector
- Store the vectors in a vector database
- When a user asks a question, first retrieve the relevant chunks
- Send the retrieved chunks together with the question to the LLM
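The five steps above can be sketched end to end without any external service. This is a minimal illustration, not the LangChain pipeline built later in the article: `embed` is a toy bag-of-words stand-in for a real embedding model, the in-memory `index` list stands in for a vector database, and the final LLM call is left out (only the prompt is built). All function names and sample sentences here are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, index: list[tuple[str, Counter]], k: int = 2) -> list[str]:
    """Return the k chunks whose vectors are most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question: str, chunks: list[str]) -> str:
    """Combine the retrieved chunks and the question into one prompt."""
    context = "\n".join(f"- {c}" for c in chunks)
    return f"Context:\n{context}\n\nQuestion: {question}"


# Steps 1-3: chunk, embed, and store (here: a plain list)
chunks = [
    "Employees receive 15 days of annual leave per year.",
    "Expense reports must be submitted within 30 days.",
    "The office is closed on public holidays.",
]
index = [(c, embed(c)) for c in chunks]

# Steps 4-5: retrieve, then build the prompt that would be sent to the LLM
question = "How many days of annual leave do employees get?"
top = retrieve(question, index, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

A real embedding model captures meaning rather than word overlap, so it can match a question to a chunk even when they share no words; the control flow stays the same.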
Tech stack
| Component | Choice | Description |
|---|---|---|
| Framework | LangChain | AI application framework |
| Vector database | Chroma / Pinecone | Stores document vectors |
| Embedding | text-embedding-3-small | Converts text to vectors |
| LLM | DeepSeek V4 | Generates the answer |
| API relay | Ciyuano | Unified API entry point |
Complete code
1. Install dependencies
pip install langchain langchain-community langchain-openai chromadb tiktoken
2. Load documents
from langchain_community.document_loaders import DirectoryLoader, TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Load every Markdown file under ./docs, including subdirectories
loader = DirectoryLoader(
    "./docs",
    glob="**/*.md",
    loader_cls=TextLoader,  # read files as plain text
    loader_kwargs={"encoding": "utf-8"},
)
documents = loader.load()

# Split the documents into overlapping chunks
splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,    # maximum characters per chunk
    chunk_overlap=50,  # characters shared between neighboring chunks
    separators=["\n\n", "\n", "。", "!", "?"],
)
chunks = splitter.split_documents(documents)
print(f"Split into {len(chunks)} document chunks")
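To see how `chunk_size` and `chunk_overlap` interact, here is a simplified fixed-width sketch. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter prefers to break on the configured separators); the helper name `sliding_chunks` is hypothetical and only illustrates the windowing arithmetic.

```python
def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into fixed-width windows that overlap by chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - chunk_overlap  # how far each window advances
    last_start = max(len(text) - chunk_overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, last_start, step)]


pieces = sliding_chunks("abcdefghij", chunk_size=4, chunk_overlap=1)
print(pieces)  # ['abcd', 'defg', 'ghij']
```

Each chunk starts with the last character of the previous one. With the article's settings (500 and 50), each window would advance 450 characters, so a sentence that straddles a boundary still appears whole in at least one chunk if it is shorter than the overlap.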
3. Create vector database
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma

# Embedding model, called through the OpenAI-compatible relay endpoint
embeddings = OpenAIEmbeddings(
    base_url="https://www.ciyuano.com/v1",
    api_key="sk-relay-Your API key",  # replace with your own key
    model="text-embedding-3-small",
)

# Embed all chunks and persist the vectors to ./chroma_db
vectorstore = Chroma.from_documents(
    chunks,
    embeddings,
    persist_directory="./chroma_db",
)
print("Vector database created!")
4. Build Q&A chain
from langchain.chains import RetrievalQA
from langchain_openai import ChatOpenAI

# Chat model, served through the same relay endpoint
llm = ChatOpenAI(
    model="deepseek-v4-flash",
    base_url="https://www.ciyuano.com/v1",
    api_key="sk-relay-Your API key",  # replace with your own key
    temperature=0,  # keep randomness low for factual Q&A
)

# Retrieve the 3 most similar chunks and pass them to the LLM with the question
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=vectorstore.as_retriever(
        search_kwargs={"k": 3}
    ),
    return_source_documents=True,
)

# Query the knowledge base
result = qa_chain.invoke({"query": "What is the company's leave policy?"})
print("Answer:", result["result"])
print("Source:", [doc.metadata["source"] for doc in result["source_documents"]])
Optimization tips
1. Splitting strategy
Choose the splitting method based on the document type:
- Markdown: split by heading
- Code: split by function or class
- PDF: split by paragraph
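As an illustration of the Markdown case, the sketch below splits a document at its headings using only the standard library. The function name `split_markdown_by_heading` and the sample text are hypothetical, and the sketch makes a simplifying assumption: it treats every line starting with `#` as a heading, so `#` lines inside fenced code blocks would be misread.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def split_markdown_by_heading(text: str) -> list[dict]:
    """Return one section per heading, each with its heading text and body."""
    sections: list[dict] = []
    title, lines = None, []

    def flush() -> None:
        # Emit the section collected so far, skipping an empty preamble
        if title is not None or any(line.strip() for line in lines):
            sections.append({"heading": title, "content": "\n".join(lines).strip()})

    for line in text.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            title, lines = match.group(2).strip(), []
        else:
            lines.append(line)
    flush()
    return sections


doc = "# Leave policy\nEmployees get 15 days.\n\n## Sick leave\nBring a doctor's note.\n"
sections = split_markdown_by_heading(doc)
for section in sections:
    print(section["heading"], "->", section["content"])
```

Keeping the heading alongside each section means it can be stored as chunk metadata, which helps when citing sources in the answer.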
2. Retrieval optimization
- Use hybrid retrieval (vector + keyword)
- Tune top_k, the number of chunks retrieved
- Use a reranking model
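One common way to combine vector and keyword results is reciprocal rank fusion (RRF). The article does not name a fusion method, so RRF is an assumption here, as are the function name, the constant `k=60` (a conventional default), and the document IDs. The sketch takes the two ranked lists as given and only shows the merging step.

```python
def reciprocal_rank_fusion(
    rankings: list[list[str]], k: int = 60, top_k: int = 3
) -> list[str]:
    """Merge several ranked ID lists; each hit scores 1 / (k + rank)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest score first; ties are broken alphabetically for stable output
    return sorted(scores, key=lambda d: (-scores[d], d))[:top_k]


vector_hits = ["doc_a", "doc_b", "doc_c"]   # hypothetical vector-search ranking
keyword_hits = ["doc_b", "doc_d", "doc_a"]  # hypothetical keyword-search ranking
fused = reciprocal_rank_fusion([vector_hits, keyword_hits], top_k=3)
print(fused)  # ['doc_b', 'doc_a', 'doc_d']
```

Documents that rank well in both lists rise to the top, and `top_k` controls how many fused results are passed on to the LLM or to a reranking model.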
3. Prompt optimization
In the prompt, explicitly require the model to:
- Answer only from the provided documents
- Say "I'm not sure" when the documents contain no relevant information
- Cite the source documents
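The three requirements above could be written into a prompt along these lines. This is a sketch with hypothetical wording: the function name `build_grounded_prompt`, the document dictionary layout, and the sample data are assumptions, and wiring the prompt into the Q&A chain is left out.

```python
def build_grounded_prompt(question: str, docs: list[dict]) -> str:
    """Build a prompt that restricts the answer to the given documents."""
    context = "\n\n".join(
        f"[{i}] (source: {d['source']})\n{d['text']}"
        for i, d in enumerate(docs, start=1)
    )
    return (
        "Answer the question using only the documents below.\n"
        "If the documents do not contain the answer, reply exactly: I'm not sure.\n"
        "Cite the documents you used by their bracketed numbers.\n\n"
        f"Documents:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


docs = [
    {"source": "docs/hr.md", "text": "Employees receive 15 days of annual leave."},
    {"source": "docs/finance.md", "text": "Expense reports are due within 30 days."},
]
prompt = build_grounded_prompt("What is the company's leave policy?", docs)
print(prompt)
```

Numbering the documents gives the model a short, unambiguous way to cite them, and the numbers can be mapped back to file paths after generation.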
Summary
RAG is a best practice for enterprise AI applications. With LangChain and a vector database, you can quickly build an accurate, reliable enterprise knowledge base.