Retrieval-Augmented Generation (RAG) Tutorial: An Implementation with Spring AI and Java

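1. What Is RAG

Retrieval-Augmented Generation (RAG) combines information retrieval with generative AI. It lets a large language model (LLM) consult external knowledge sources when answering a question, which helps to:

Improve the accuracy and reliability of answers
Reduce hallucinations (generated content that is factually wrong)
Give the model access to up-to-date information
Answer questions that depend on domain-specific knowledge

The basic RAG flow:

Convert the document data into a retrievable format
Receive the user's query
Retrieve the information in the documents that is relevant to the query
Send the retrieved information to the LLM together with the query
The LLM generates an answer grounded in the retrieved information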
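
The flow above can be sketched in plain Java before any framework is involved. This is only a toy illustration: the class `ToyRagFlow` and its methods are hypothetical names, word counts stand in for a real embedding model, an in-memory list stands in for the vector store, and the last step (sending the prompt to an LLM) is left out.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ToyRagFlow {

    // Stand-in for an embedding model: a sparse word-count vector.
    static Map<String, Integer> vectorize(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : text.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Euclidean length of a sparse vector.
    static double norm(Map<String, Integer> v) {
        double sum = 0;
        for (int count : v.values()) {
            sum += (double) count * count;
        }
        return Math.sqrt(sum);
    }

    // Cosine similarity between two sparse vectors (0 when either is empty).
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            dot += (double) e.getValue() * b.getOrDefault(e.getKey(), 0);
        }
        double denominator = norm(a) * norm(b);
        return denominator == 0 ? 0 : dot / denominator;
    }

    // Retrieval step: the topK chunks most similar to the query, best first.
    static List<String> retrieve(String query, List<String> chunks, int topK) {
        Map<String, Integer> queryVector = vectorize(query);
        List<String> sorted = new ArrayList<>(chunks);
        sorted.sort(Comparator.comparingDouble(
                (String chunk) -> cosine(queryVector, vectorize(chunk))).reversed());
        return sorted.subList(0, Math.min(topK, sorted.size()));
    }

    // Augmentation step: put the retrieved context and the query into one prompt.
    static String buildPrompt(String query, List<String> context) {
        return "Context:\n" + String.join("\n", context) + "\n\nQuestion: " + query;
    }

    public static void main(String[] args) {
        List<String> chunks = List.of(
                "Milvus is a vector database.",
                "Spring Boot simplifies Java applications.",
                "Maven manages project dependencies.");
        String query = "What is a vector database?";
        System.out.println(buildPrompt(query, retrieve(query, chunks, 1)));
    }
}
```

A real system replaces `vectorize` with an embedding model and the sorted list with a vector index, but the order of the steps stays the same.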

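2. Core Components of RAG

A complete RAG system consists of the following core components:

Document Loader: loads documents in various formats (plain text, PDF, HTML, and so on)
Document Splitter: splits documents into chunks small enough to process
Embedding Model: converts text into vector representations
Vector Store: stores vectors and supports similarity search
Retriever: finds the document chunks relevant to a query
Prompt Template: formats the query and the retrieved information
Large Language Model: generates an answer from the supplied information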

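3. Environment Setup

3.1 Required Tools

JDK 17 or later
Maven 3.6.3+ or Gradle 7.5+
An IDE (IntelliJ IDEA recommended)
Docker (to run the vector database)

3.2 Dependent Services

Vector database (this tutorial uses Milvus; Pinecone, Weaviate, and others also work)
LLM service (OpenAI API, Azure OpenAI, a locally deployed model, and so on)
Embedding model (OpenAI Embeddings, a HuggingFace model, and so on)

3.3 Starting the Milvus Vector Database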

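# Start Milvus standalone with Docker.
# Milvus standalone depends on etcd and MinIO. This command assumes both are
# already running and reachable from the container under the hostnames "etcd"
# and "minio" (for example, on a shared user-defined Docker network).
docker run -d --name milvus-standalone \
  -p 19530:19530 \
  -p 9091:9091 \
  -e "ETCD_ENDPOINTS=http://etcd:2379" \
  -e "MINIO_ADDRESS=minio:9000" \
  milvusdb/milvus:v2.3.4 milvus run standalone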

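4. Project Initialization

4.1 Creating the Spring Boot Project

Create the project with Spring Initializr and add the following dependencies:

Spring Web
Spring AI

4.2 Maven Dependency Configuration (pom.xml)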

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>3.2.0</version>
        <relativePath/>
    </parent>
    
    <groupId>com.example</groupId>
    <artifactId>spring-ai-rag-demo</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>spring-ai-rag-demo</name>
    
    <properties>
        <java.version>17</java.version>
        <spring-ai.version>0.8.1</spring-ai.version>
    </properties>
    
    <dependencies>
        <!-- Spring Web -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        
        <!-- Spring AI Core -->
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-core</artifactId>
        </dependency>
        
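        <!-- Spring AI OpenAI starter: auto-configures the chat and embedding clients -->
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
        </dependency>
        
        <!-- Spring AI Milvus vector store starter: auto-configures the VectorStore bean -->
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-milvus-store-spring-boot-starter</artifactId>
        </dependency>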
        
        <!-- Document Loaders -->
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-pdf-document-reader</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-tika-document-reader</artifactId>
        </dependency>
        
        <!-- Testing -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
    </dependencies>
    
    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>org.springframework.ai</groupId>
                <artifactId>spring-ai-bom</artifactId>
                <version>${spring-ai.version}</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
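    </dependencyManagement>

    <!-- Assumption: Spring AI 0.8.x artifacts are resolved from the Spring Milestones
         repository rather than Maven Central -->
    <repositories>
        <repository>
            <id>spring-milestones</id>
            <name>Spring Milestones</name>
            <url>https://repo.spring.io/milestone</url>
            <snapshots>
                <enabled>false</enabled>
            </snapshots>
        </repository>
    </repositories>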
    
    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>
</project>

4.3 Configuration File (application.yml)

spring:
  ai:
    openai:
      api-key: ${OPENAI_API_KEY:your-openai-api-key}
      base-url: https://api.openai.com  # no /v1 suffix: Spring AI appends the API path itself
      embedding:
        options:
          model: text-embedding-ada-002
      chat:
        options:
          model: gpt-3.5-turbo
          temperature: 0.7
    vectorstore:
      milvus:
        client:
          host: localhost
          port: 19530
        database-name: default
        collection-name: rag_demo_collection
        embedding-dimension: 1536  # matches the dimension of text-embedding-ada-002

# Application settings
app:
  document:
    directory: ${DOCUMENT_DIRECTORY:documents}  # directory that holds the source documents

5. Core Implementation

5.1 Document Loader

The document loader reads documents in various formats. Spring AI provides several document readers:

package com.example.rag.service;

import org.springframework.ai.document.Document;
import org.springframework.ai.reader.pdf.PagePdfDocumentReader;
import org.springframework.ai.reader.tika.TikaDocumentReader;
import org.springframework.core.io.FileSystemResource;
import org.springframework.stereotype.Service;

import java.io.File;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

@Service
public class DocumentLoaderService {

    /**
     * Loads every document in the given directory (subdirectories are skipped).
     */
    public List<Document> loadDocuments(String directoryPath) {
        File directory = new File(directoryPath);
        if (!directory.exists() || !directory.isDirectory()) {
            throw new IllegalArgumentException("Invalid document directory: " + directoryPath);
        }

        // listFiles() returns null when the directory cannot be read
        File[] files = directory.listFiles();
        if (files == null) {
            throw new IllegalStateException("Cannot list files in: " + directoryPath);
        }

        return Arrays.stream(files)
                .filter(file -> !file.isDirectory())
                .flatMap(file -> loadDocument(file).stream())
                .collect(Collectors.toList());
    }

    /**
     * Loads a single document with a reader chosen by file extension.
     */
    private List<Document> loadDocument(File file) {
        String fileName = file.getName().toLowerCase();
        FileSystemResource resource = new FileSystemResource(file);

        if (fileName.endsWith(".pdf")) {
            // PDF: PagePdfDocumentReader produces Document objects page by page
            PagePdfDocumentReader pdfReader = new PagePdfDocumentReader(resource);
            return pdfReader.get();
        } else {
            // Other formats (text, Word, HTML, ...) go through Apache Tika
            TikaDocumentReader tikaReader = new TikaDocumentReader(resource);
            return tikaReader.get();
        }
    }
}

5.2 Document Splitter

Split long documents into smaller chunks for downstream processing:

package com.example.rag.service;

import org.springframework.ai.document.Document;
import org.springframework.ai.transformer.splitter.TokenTextSplitter;
import org.springframework.stereotype.Service;

import java.util.List;

@Service
public class DocumentSplitterService {

    // TokenTextSplitter splits by token count. The no-argument constructor uses the
    // library defaults (chunks of roughly 800 tokens). Spring AI 0.8.1 offers no
    // (chunkSize, overlap) constructor, and this splitter does not overlap chunks.
    private final TokenTextSplitter textSplitter = new TokenTextSplitter();

    /**
     * Splits the documents into smaller chunks.
     */
    public List<Document> splitDocuments(List<Document> documents) {
        return textSplitter.apply(documents);
    }
}

5.3 Embedding Model

Convert text into vector representations:

package com.example.rag.service;

import org.springframework.ai.embedding.EmbeddingClient;
import org.springframework.stereotype.Service;

import java.util.List;

@Service
public class EmbeddingService {

    private final EmbeddingClient embeddingClient;

    public EmbeddingService(EmbeddingClient embeddingClient) {
        this.embeddingClient = embeddingClient;
    }

    /**
     * Converts a list of texts into vectors, one per input text.
     * In Spring AI 0.8.1 an embedding is represented as a List<Double>.
     */
    public List<List<Double>> embed(List<String> texts) {
        return embeddingClient.embed(texts);
    }
    
    /**
     * Converts a single text into a vector.
     */
    public List<Double> embed(String text) {
        return embeddingClient.embed(text);
    }
}

5.4 Vector Store

Store and retrieve vector data:

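package com.example.rag.service;

import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.stereotype.Service;

import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

@Service
public class VectorStoreService {

    // Inject the VectorStore interface; the Milvus starter supplies the implementation
    private final VectorStore vectorStore;

    // IDs of the documents added through this service, kept so that clear() can delete
    // them: the VectorStore interface in Spring AI 0.8.1 has no delete-all operation
    private final Set<String> storedIds = ConcurrentHashMap.newKeySet();

    public VectorStoreService(VectorStore vectorStore) {
        this.vectorStore = vectorStore;
    }

    /**
     * Stores the documents in the vector database.
     * The vector store embeds them with the configured embedding client.
     */
    public void addDocuments(List<Document> documents) {
        vectorStore.add(documents);
        documents.forEach(document -> storedIds.add(document.getId()));
    }

    /**
     * Retrieves the topK documents most similar to the query text.
     * The vector store embeds the query itself; the VectorStore interface
     * does not accept a precomputed query vector.
     */
    public List<Document> search(String query, int topK) {
        return vectorStore.similaritySearch(SearchRequest.query(query).withTopK(topK));
    }

    /**
     * Deletes every document that was added through this service instance.
     * Documents stored before the application started are not tracked and remain.
     */
    public void clear() {
        if (!storedIds.isEmpty()) {
            vectorStore.delete(new ArrayList<>(storedIds));
            storedIds.clear();
        }
    }
}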

5.5 Retriever

Encapsulate the retrieval logic:

package com.example.rag.service;

import org.springframework.ai.document.Document;
import org.springframework.stereotype.Service;

import java.util.List;

@Service
public class RetrieverService {

    private final VectorStoreService vectorStoreService;

    public RetrieverService(VectorStoreService vectorStoreService) {
        this.vectorStoreService = vectorStoreService;
    }

    /**
     * Retrieves the documents relevant to a query.
     * @param query the query text
     * @param topK the number of documents to return
     * @return the list of relevant documents
     */
    public List<Document> retrieve(String query, int topK) {
        return vectorStoreService.search(query, topK);
    }
}

5.6 Prompt Template

Build a prompt template that guides the LLM in generating its answer:

package com.example.rag.service;

import org.springframework.ai.chat.prompt.PromptTemplate;
import org.springframework.stereotype.Service;

import java.util.HashMap;
import java.util.List;
import java.util.Map;

@Service
public class PromptTemplateService {

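    // RAG prompt template
    private static final String RAG_PROMPT_TEMPLATE = """
    You are a question-answering system that answers the user's question from the reference documents provided.
    Base your answer strictly on the document content below and do not make up information.
    If the documents contain no relevant information, state clearly: "Based on the provided documents, this question cannot be answered."
    
    Reference documents:
    {documents}
    
    User question:
    {question}
    
    Answer:
    """;

    /**
     * Creates the RAG prompt.
     */
    public String createRagPrompt(String question, List<String> documentContents) {
        // Join the document contents into a single string
        String documents = String.join("\n\n", documentContents);
        
        // Fill in the template parameters
        Map<String, Object> params = new HashMap<>();
        params.put("documents", documents);
        params.put("question", question);
        
        // render() substitutes the parameters and returns the final prompt text
        PromptTemplate promptTemplate = new PromptTemplate(RAG_PROMPT_TEMPLATE, params);
        return promptTemplate.render();
    }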
}

5.7 Calling the LLM

Call the LLM to generate an answer:

package com.example.rag.service;

import org.springframework.ai.chat.ChatClient;
import org.springframework.ai.chat.ChatResponse;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.stereotype.Service;

@Service
public class LlmService {

    private final ChatClient chatClient;

    public LlmService(ChatClient chatClient) {
        this.chatClient = chatClient;
    }

    /**
     * Calls the LLM to generate an answer.
     */
    public String generateResponse(String prompt) {
        ChatResponse response = chatClient.call(new Prompt(prompt));
        return response.getResult().getOutput().getContent();
    }
}

5.8 Assembling the RAG Pipeline

Combine all the components into a complete RAG pipeline:

package com.example.rag.service;

import org.springframework.ai.document.Document;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Service;

import java.util.List;
import java.util.stream.Collectors;

@Service
public class RagPipelineService {

    private final DocumentLoaderService documentLoaderService;
    private final DocumentSplitterService documentSplitterService;
    private final VectorStoreService vectorStoreService;
    private final RetrieverService retrieverService;
    private final PromptTemplateService promptTemplateService;
    private final LlmService llmService;
    
    @Value("${app.document.directory}")
    private String documentDirectory;

    public RagPipelineService(DocumentLoaderService documentLoaderService,
                             DocumentSplitterService documentSplitterService,
                             VectorStoreService vectorStoreService,
                             RetrieverService retrieverService,
                             PromptTemplateService promptTemplateService,
                             LlmService llmService) {
        this.documentLoaderService = documentLoaderService;
        this.documentSplitterService = documentSplitterService;
        this.vectorStoreService = vectorStoreService;
        this.retrieverService = retrieverService;
        this.promptTemplateService = promptTemplateService;
        this.llmService = llmService;
    }

    /**
     * Loads and processes the documents, then stores them in the vector database.
     */
    public void loadAndProcessDocuments() {
        // 1. Load the documents
        List<Document> documents = documentLoaderService.loadDocuments(documentDirectory);
        System.out.println("Loaded " + documents.size() + " documents");
        
        // 2. Split the documents into chunks
        List<Document> splitDocuments = documentSplitterService.splitDocuments(documents);
        System.out.println("Split into " + splitDocuments.size() + " document chunks");
        
        // 3. Store the chunks in the vector database
        vectorStoreService.addDocuments(splitDocuments);
        System.out.println("Documents stored in vector database");
    }

    /**
     * Runs the RAG flow to answer a user question.
     */
    public String answerQuestion(String question, int topK) {
        // 1. Retrieve the relevant documents
        List<Document> relevantDocuments = retrieverService.retrieve(question, topK);
        System.out.println("Retrieved " + relevantDocuments.size() + " relevant documents");
        
        // 2. Extract the document contents
        List<String> documentContents = relevantDocuments.stream()
                .map(Document::getContent)
                .collect(Collectors.toList());
        
        // 3. Create the prompt
        String prompt = promptTemplateService.createRagPrompt(question, documentContents);
        
        // 4. Call the LLM to generate the answer
        return llmService.generateResponse(prompt);
    }

    /**
     * Clears the data held in the vector database.
     */
    public void clearVectorStore() {
        vectorStoreService.clear();
        System.out.println("Vector store cleared");
    }
}

6. Complete Example Application

6.1 Controller

Create REST endpoints for external callers:

package com.example.rag.controller;

import com.example.rag.service.RagPipelineService;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.*;

@RestController
@RequestMapping("/api/rag")
public class RagController {

    private final RagPipelineService ragPipelineService;

    public RagController(RagPipelineService ragPipelineService) {
        this.ragPipelineService = ragPipelineService;
    }

    /**
     * Loads and processes the documents.
     */
    @PostMapping("/load-documents")
    public ResponseEntity<String> loadDocuments() {
        try {
            ragPipelineService.loadAndProcessDocuments();
            return ResponseEntity.ok("Documents loaded and processed successfully");
        } catch (Exception e) {
            return ResponseEntity.internalServerError().body("Error loading documents: " + e.getMessage());
        }
    }

    /**
     * Clears the vector store.
     */
    @DeleteMapping("/clear-vector-store")
    public ResponseEntity<String> clearVectorStore() {
        try {
            ragPipelineService.clearVectorStore();
            return ResponseEntity.ok("Vector store cleared successfully");
        } catch (Exception e) {
            return ResponseEntity.internalServerError().body("Error clearing vector store: " + e.getMessage());
        }
    }

    /**
     * Answers a question.
     */
    @GetMapping("/answer")
    public ResponseEntity<String> answerQuestion(
            @RequestParam String question,
            @RequestParam(defaultValue = "3") int topK) {
        try {
            String answer = ragPipelineService.answerQuestion(question, topK);
            return ResponseEntity.ok(answer);
        } catch (Exception e) {
            return ResponseEntity.internalServerError().body("Error answering question: " + e.getMessage());
        }
    }
}

6.2 Main Application Class

package com.example.rag;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class SpringAiRagDemoApplication {

    public static void main(String[] args) {
        SpringApplication.run(SpringAiRagDemoApplication.class, args);
    }
}

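6.3 Usage Example

Start the application: make sure Milvus is running and the OpenAI API key is configured
Prepare documents: create a documents folder in the project root and put the documents to process in it
Load documents: send a POST request to http://localhost:8080/api/rag/load-documents
Ask a question: send a GET request to http://localhost:8080/api/rag/answer?question=<your question> (URL-encode the question text)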
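
The last two steps can also be driven from Java with the JDK's built-in HTTP client. The sketch below is one possible client, not part of the tutorial project: `RagApiClient` and `answerUri` are hypothetical names, and the base URL assumes the application runs locally on the default port 8080.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class RagApiClient {

    // Builds the /answer URI. The question is URL-encoded so that spaces,
    // punctuation, and non-ASCII text survive the query string.
    static URI answerUri(String baseUrl, String question, int topK) {
        String encoded = URLEncoder.encode(question, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/api/rag/answer?question=" + encoded + "&topK=" + topK);
    }

    public static void main(String[] args) throws Exception {
        String baseUrl = "http://localhost:8080"; // assumption: local app on the default port
        HttpClient client = HttpClient.newHttpClient();

        // Load documents: POST with an empty body
        HttpRequest load = HttpRequest.newBuilder(URI.create(baseUrl + "/api/rag/load-documents"))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
        System.out.println(client.send(load, HttpResponse.BodyHandlers.ofString()).body());

        // Ask a question: GET with the encoded question and topK
        HttpRequest ask = HttpRequest.newBuilder(answerUri(baseUrl, "What is RAG?", 3))
                .GET()
                .build();
        System.out.println(client.send(ask, HttpResponse.BodyHandlers.ofString()).body());
    }
}
```

Running `main` requires the application and its dependent services to be up; `answerUri` on its own has no such requirement.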

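7. Optimization Strategies

7.1 Document Processing Optimization

Choose a splitting strategy that suits the document type
Tune the chunk size and overlap (chunks of 200-1000 tokens are typical)
Attach metadata to documents (source, date, and so on) to support filtered retrieval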
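
To make chunk size and overlap concrete, here is a minimal sliding-window chunker. It is a simplified illustration under stated assumptions: `OverlapChunker` is a hypothetical class, and it counts whitespace-separated words rather than model tokens, so the numbers are not directly comparable to token-based limits.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OverlapChunker {

    // Splits text into chunks of at most chunkSize words.
    // Consecutive chunks share `overlap` words so context is not cut at a boundary.
    static List<String> chunk(String text, int chunkSize, int overlap) {
        if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
            throw new IllegalArgumentException("Require chunkSize > 0 and 0 <= overlap < chunkSize");
        }
        List<String> chunks = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return chunks;
        }
        String[] words = trimmed.split("\\s+");
        int step = chunkSize - overlap; // how far the window advances each time
        for (int start = 0; start < words.length; start += step) {
            int end = Math.min(start + chunkSize, words.length);
            chunks.add(String.join(" ", Arrays.copyOfRange(words, start, end)));
            if (end == words.length) {
                break; // the last window reached the end of the text
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        chunk("one two three four five six seven eight nine ten", 4, 1)
                .forEach(System.out::println);
    }
}
```

A larger overlap preserves more context across boundaries at the cost of more chunks to embed and store.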

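7.2 Retrieval Optimization

Try different retrieval strategies:
- Similarity search
- Hybrid retrieval (keyword + vector)
- Hierarchical retrieval (coarse filtering first, then fine filtering)
Tune the topK parameter (3-10 is typical)
Re-rank the retrieved results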
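
One common way to combine a keyword ranking with a vector ranking is reciprocal rank fusion (RRF), which the tutorial itself does not prescribe; it is shown here only as an example of hybrid retrieval. The class `RankFusion` is a hypothetical name, the document IDs are made up, and the constant 60 is a conventional choice rather than a tuned value.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RankFusion {

    // Reciprocal rank fusion: every ranking contributes 1 / (k + rank) to a
    // document's score, with rank starting at 1. Higher total score ranks first.
    static List<String> fuse(List<List<String>> rankings, int k) {
        Map<String, Double> scores = new LinkedHashMap<>();
        for (List<String> ranking : rankings) {
            for (int i = 0; i < ranking.size(); i++) {
                scores.merge(ranking.get(i), 1.0 / (k + i + 1), Double::sum);
            }
        }
        List<String> fused = new ArrayList<>(scores.keySet());
        // Stable sort: documents with equal scores keep first-seen order
        fused.sort((a, b) -> Double.compare(scores.get(b), scores.get(a)));
        return fused;
    }

    public static void main(String[] args) {
        List<String> vectorRanking = List.of("doc-a", "doc-b", "doc-c");
        List<String> keywordRanking = List.of("doc-c", "doc-a", "doc-d");
        System.out.println(fuse(List.of(vectorRanking, keywordRanking), 60));
    }
}
```

Because RRF uses only rank positions, it needs no calibration between keyword scores and vector similarities, which usually live on different scales.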

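7.3 Prompt Optimization

Tell the model more explicitly how to use the reference documents
Add format constraints (for example bullet points or a structured answer)
Tailor the prompt to the specific domain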
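
As an illustration of these three points, the sketch below builds a prompt that numbers each source, constrains the answer format, and names the domain. `CitedPromptBuilder` and the wording of the instructions are assumptions made for this example; the right phrasing depends on the model and the use case.

```java
import java.util.List;

public class CitedPromptBuilder {

    // Builds a prompt with a domain statement, explicit usage rules,
    // a format constraint, and numbered sources the model can cite.
    static String build(String domain, String question, List<String> chunks) {
        StringBuilder prompt = new StringBuilder();
        prompt.append("You are an assistant for ").append(domain).append(".\n");
        prompt.append("Answer using only the numbered sources below.\n");
        prompt.append("Format: at most three bullet points, each ending with its source number, e.g. [1].\n");
        prompt.append("If the sources do not contain the answer, say so.\n\n");
        for (int i = 0; i < chunks.size(); i++) {
            prompt.append('[').append(i + 1).append("] ").append(chunks.get(i)).append('\n');
        }
        prompt.append("\nQuestion: ").append(question);
        return prompt.toString();
    }

    public static void main(String[] args) {
        System.out.println(build(
                "internal HR policy",
                "How many vacation days do employees get?",
                List.of("Employees get 20 vacation days per year.", "Sick leave is tracked separately.")));
    }
}
```

Numbering the sources also makes answers easier to audit, since each bullet point can be traced back to a retrieved chunk.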

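7.4 Performance Optimization

Cache document-processing results
Process documents in batches
Consider a local embedding model to reduce API calls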
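
Caching and batching can be combined in a small wrapper around whatever embedding call the application uses. The sketch below is a simplified, single-threaded illustration: `CachingBatchEmbedder` is a hypothetical class, the `backend` function stands in for a real batch embedding call, and the cache is an unbounded in-memory map.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

public class CachingBatchEmbedder {

    // Hypothetical backend: takes a batch of texts, returns one vector per text
    private final Function<List<String>, List<float[]>> backend;
    private final int batchSize;
    private final Map<String, float[]> cache = new HashMap<>();

    CachingBatchEmbedder(Function<List<String>, List<float[]>> backend, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.backend = backend;
        this.batchSize = batchSize;
    }

    // Returns one vector per input text; only texts not seen before reach the backend.
    List<float[]> embed(List<String> texts) {
        // Distinct uncached texts, in first-seen order
        Set<String> missingSet = new LinkedHashSet<>();
        for (String text : texts) {
            if (!cache.containsKey(text)) {
                missingSet.add(text);
            }
        }
        List<String> missing = new ArrayList<>(missingSet);

        // Send the uncached texts to the backend in batches
        for (int start = 0; start < missing.size(); start += batchSize) {
            List<String> batch = missing.subList(start, Math.min(start + batchSize, missing.size()));
            List<float[]> vectors = backend.apply(batch);
            for (int i = 0; i < batch.size(); i++) {
                cache.put(batch.get(i), vectors.get(i));
            }
        }

        List<float[]> result = new ArrayList<>();
        for (String text : texts) {
            result.add(cache.get(text));
        }
        return result;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        CachingBatchEmbedder embedder = new CachingBatchEmbedder(batch -> {
            calls[0]++;
            List<float[]> vectors = new ArrayList<>();
            for (String text : batch) {
                vectors.add(new float[] {text.length()}); // fake one-dimensional embedding
            }
            return vectors;
        }, 2);
        embedder.embed(List.of("a", "bb", "ccc"));
        embedder.embed(List.of("a", "dddd"));
        System.out.println("backend calls: " + calls[0]);
    }
}
```

A production version would bound the cache, key it on a content hash rather than the full text, and guard it for concurrent use.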

8. Deployment and Scaling

8.1 Containerized Deployment

Create a Dockerfile:

FROM eclipse-temurin:17-jre-alpine
VOLUME /tmp
COPY target/*.jar app.jar
ENTRYPOINT ["java","-jar","/app.jar"]

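8.2 Horizontal Scaling

The document-processing service can be deployed independently and refresh the vector store on a schedule
The retrieval and generation services can scale horizontally to handle high concurrency
The vector database can run in cluster mode for higher availability

8.3 Monitoring and Logging

Add Spring Boot Actuator to monitor application health
Log in detail to make troubleshooting easier
Monitor RAG performance metrics (response time, accuracy, and so on)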
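
Response time is the easiest of these metrics to start with. The sketch below times a call and reports a nearest-rank percentile over the recorded samples. It is a plain-Java illustration with hypothetical names (`LatencyTracker`, `time`, `percentile`); in a Spring Boot application a metrics library would normally take this role.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

public class LatencyTracker {

    private final List<Long> samplesMillis = new ArrayList<>();

    // Runs the call, records its duration in milliseconds, and returns its result.
    <T> T time(Supplier<T> call) {
        long start = System.nanoTime();
        try {
            return call.get();
        } finally {
            record((System.nanoTime() - start) / 1_000_000);
        }
    }

    synchronized void record(long millis) {
        samplesMillis.add(millis);
    }

    // Nearest-rank percentile for p in (0, 100]: the smallest sample such that
    // at least p percent of all samples are less than or equal to it.
    synchronized long percentile(double p) {
        if (samplesMillis.isEmpty()) {
            throw new IllegalStateException("No samples recorded");
        }
        List<Long> sorted = new ArrayList<>(samplesMillis);
        Collections.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.size());
        return sorted.get(Math.max(rank, 1) - 1);
    }

    public static void main(String[] args) {
        LatencyTracker tracker = new LatencyTracker();
        // Stand-in for a call such as ragPipelineService.answerQuestion(question, topK)
        String answer = tracker.time(() -> "example answer");
        System.out.println(answer + " (p95 = " + tracker.percentile(95) + " ms)");
    }
}
```

Tracking a high percentile alongside the average helps surface slow retrieval or generation calls that an average alone would hide.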

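9. Summary

This tutorial showed how to build a complete RAG system with Spring AI and Java, covering:

The basic concepts and working principles of RAG
Implementations of the core components: document loading, splitting, embedding, vector storage, retrieval, and generation
Complete code examples and usage instructions
Optimization strategies and suggestions for deployment and scaling

With this RAG system, an LLM can draw on domain-specific documents to produce more accurate and reliable answers. Depending on your needs, you can extend it further, for example by supporting more document types, implementing more sophisticated retrieval strategies, or integrating other LLM services.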