RAG in Practice: Building an Enterprise Knowledge Base with LangChain + Vector Database

2026/06/04

The biggest problem with LLMs is "hallucination": the model will confidently fabricate answers. RAG (Retrieval-Augmented Generation) greatly reduces this problem by having the model first retrieve relevant documents and then answer based on them.

What is RAG?

RAG stands for Retrieval-Augmented Generation. The core idea:

Split enterprise documents into small chunks
Use an embedding model to convert each chunk into a vector
Store the vectors in a vector database
When a user asks a question, first retrieve the relevant chunks
Send the retrieved chunks together with the question to the LLM
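The five steps above can be sketched without any external library. This is only an illustration of the flow: the bag-of-words `embed` function stands in for a real embedding model, a plain list stands in for the vector database, and the sample chunks and function names are hypothetical.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Steps 1-3: chunks are embedded and stored (a plain list plays the vector database)
chunks = [
    "Employees receive 15 days of paid annual leave per year.",
    "Expense reports must be submitted within 30 days.",
    "The office is closed on public holidays.",
]
index = [(chunk, embed(chunk)) for chunk in chunks]

def retrieve(question, k=2):
    """Step 4: return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

def compose_llm_input(question, k=2):
    """Step 5: combine the retrieved chunks and the question into one LLM input."""
    context = "\n".join(retrieve(question, k))
    return f"Context:\n{context}\n\nQuestion: {question}"

print(compose_llm_input("How many days of annual leave do employees get?"))
```

In a real system the word-count vectors would be replaced by dense vectors from an embedding model, which also match paraphrases that share no words with the question.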

Tech stack

Component	Choice	Description
Framework	LangChain	AI application framework
Vector database	Chroma / Pinecone	Stores document vectors
Embedding	text-embedding-3-small	Converts text to vectors
LLM	DeepSeek V4	Generates answers
API relay	Ciyuano	Unified API entry point

Complete code

1. Install dependencies

pip install langchain langchain-community langchain-openai langchain-text-splitters chromadb tiktoken

2. Load documents

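from langchain_community.document_loaders import (
    DirectoryLoader,
    TextLoader,
    PyPDFLoader,  # not used below; swap in for PDF files (needs the pypdf package)
)
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Load every Markdown file under ./docs, including subdirectories
loader = DirectoryLoader(
    "./docs",
    glob="**/*.md",
    loader_cls=TextLoader,  # read files as plain text
    loader_kwargs={"encoding": "utf-8"},
)
documents = loader.load()

# Split the documents into overlapping chunks
splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,
    chunk_overlap=50,
    # Tried in order; the last entries are fallbacks for text without CJK punctuation
    separators=["\n\n", "\n", "。", "！", "？", ". ", " ", ""],
)
chunks = splitter.split_documents(documents)
print(f"Split into {len(chunks)} document chunks")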
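To make the effect of `chunk_size` and `chunk_overlap` concrete, here is a simplified sliding-window splitter. It is not how `RecursiveCharacterTextSplitter` works internally (that class splits on separators first and then merges pieces); the function name is hypothetical and the sketch only illustrates how the two parameters interact.

```python
def split_with_overlap(text, chunk_size=500, chunk_overlap=50):
    """Cut text into fixed-size windows; each window starts
    chunk_overlap characters before the previous window ends."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks

sample = "abcdefghijklmnopqrstuvwxyz"
pieces = split_with_overlap(sample, chunk_size=10, chunk_overlap=3)
print(pieces)  # ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

The last three characters of each piece reappear at the start of the next one, so a sentence that straddles a chunk boundary is still fully contained in at least one chunk as long as it is shorter than the overlap.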

3. Create vector database

from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma

# Embedding model, called through the API relay
embeddings = OpenAIEmbeddings(
    base_url="https://www.ciyuano.com/v1",
    api_key="sk-relay-Your API key",
    model="text-embedding-3-small",
)

# Embed every chunk and persist the vectors to ./chroma_db
vectorstore = Chroma.from_documents(
    chunks,
    embeddings,
    persist_directory="./chroma_db",
)
print("Vector database created!")

4. Build Q&A chain

from langchain.chains import RetrievalQA
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="deepseek-v4-flash",
    base_url="https://www.ciyuano.com/v1",
    api_key="sk-relay-Your API key",
    temperature=0,  # minimize randomness for factual answers
)

# Retrieve the 3 most similar chunks and pass them to the LLM with the question
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=vectorstore.as_retriever(search_kwargs={"k": 3}),
    return_source_documents=True,
)

# Query
result = qa_chain.invoke({"query": "What is the company's leave policy?"})
print("Answer:", result["result"])
print("Sources:", [doc.metadata["source"] for doc in result["source_documents"]])

Optimization tips

1. Splitting strategy

Choose the splitting method based on the document type:

Markdown: split by heading
Code: split by function or class
PDF: split by paragraph
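As an illustration of the Markdown case, the sketch below splits a document into one section per heading using only the standard library. The function name and sample text are hypothetical, and the sketch ignores edge cases such as `#` lines inside fenced code blocks; LangChain's `MarkdownHeaderTextSplitter` is the more complete option.

```python
import re

def split_markdown_by_heading(markdown_text):
    """Return a list of (heading, body) pairs, one per Markdown heading.
    Text before the first heading gets an empty heading."""
    sections = []
    heading, body = "", []
    for line in markdown_text.splitlines():
        if re.match(r"^#{1,6}\s+\S", line):
            # A new heading closes the previous section, if it has any content
            if heading or any(ln.strip() for ln in body):
                sections.append((heading, "\n".join(body).strip()))
            heading, body = line.lstrip("#").strip(), []
        else:
            body.append(line)
    if heading or any(ln.strip() for ln in body):
        sections.append((heading, "\n".join(body).strip()))
    return sections

doc = """# Leave policy
Annual leave is 15 days.

## Sick leave
A doctor's note is required after 3 days.
"""
for heading, body in split_markdown_by_heading(doc):
    print(repr(heading), "->", repr(body))
```

Keeping the heading alongside each body lets it be stored as chunk metadata, which gives the retriever and the LLM some context about where a chunk came from.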

2. Retrieval optimization

Use hybrid retrieval (vector + keyword)
Tune top_k, the number of chunks retrieved per query
Use a reranking model
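One common way to combine a vector ranking with a keyword ranking is reciprocal rank fusion (RRF). The article does not prescribe a fusion method, so treat this as one possible sketch; the document IDs and the two input rankings are hypothetical, and `rrf_k=60` is simply a commonly used default.

```python
def reciprocal_rank_fusion(rankings, rrf_k=60):
    """Merge several ranked lists of document IDs into one list, best first.
    Each document scores 1 / (rrf_k + rank) for every list it appears in."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (rrf_k + rank)
    # Sort by descending score; ties are broken by document ID for stable output
    return sorted(scores, key=lambda d: (-scores[d], d))

vector_hits = ["doc_a", "doc_b", "doc_c"]   # hypothetical vector-search ranking
keyword_hits = ["doc_b", "doc_d", "doc_a"]  # hypothetical keyword-search ranking

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
top_k = fused[:3]
print(top_k)  # ['doc_b', 'doc_a', 'doc_d']
```

Documents that rank well in both lists (here `doc_b` and `doc_a`) rise to the top, and only ranks are used, so the raw scores of the two retrievers never need to be put on the same scale.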

3. Prompt optimization

In the prompt, explicitly require the model to:

Answer only based on the provided documents
State "I'm not sure" if the documents contain no relevant information
Cite the document sources
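A prompt that encodes these three requirements might look like the sketch below. The template wording, the function name, and the sample document are assumptions for illustration, not the prompt used by the chain above.

```python
PROMPT_TEMPLATE = """Answer the question using only the documents below.
If the documents do not contain the answer, reply exactly: I'm not sure.
Cite the source of every fact in square brackets, e.g. [handbook.md].

Documents:
{context}

Question: {question}
Answer:"""

def build_grounded_prompt(question, docs):
    """docs is a list of (source, text) pairs; each text is labeled
    with its source so the model has something to cite."""
    context = "\n\n".join(f"[{source}]\n{text}" for source, text in docs)
    return PROMPT_TEMPLATE.format(context=context, question=question)

prompt = build_grounded_prompt(
    "What is the company's leave policy?",
    [("handbook.md", "Employees receive 15 days of paid annual leave.")],  # hypothetical document
)
print(prompt)
```

Labeling each document with its source inside the context is what makes the citation requirement practical: the model can only cite identifiers it has actually seen.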

Summary

RAG is a best practice for enterprise AI applications. With LangChain and a vector database, you can quickly build an accurate, reliable enterprise knowledge base system.
