The Factory Pattern in Practice for Large Language Model Clients

Author: 老汪软件技巧
Published: 2024-09-11 21:01

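1. Introduction

When working with large language models (LLMs), we often wrap the different LLM clients and use a design pattern to manage how they are instantiated, which makes the code easier to maintain, extend, and reuse. This article focuses on using the factory pattern to wrap LLM clients so that different LLM service providers, such as OpenAI, Deepseek, and iFlytek Spark (讯飞星火), can be supported.

Concrete code examples show step by step how to build this wrapper: a base client class, a concrete OpenAI implementation, and the factory itself.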

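2. Wrapping LLM Clients

Base client class design

First we define a base client class, BaseLLMClient. It declares the common interface and provides basic features such as message management and setting the system role content.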

#!/usr/bin/python3
# -*- coding: utf-8 -*-
# @Author: Hui
# @File: base.py
# @Desc: { llm base client }
# @Date: 2024/08/24 12:14
import copy
from typing import Optional

# NOTE: the original listing omits the AsyncUtil import. The path below is an
# assumption about the author's py-tools package; adjust it to your setup.
from py_tools.utils.async_util import AsyncUtil

from src.llm.config import BaseLLMConfig


class BaseLLMClient:
    def __init__(self, llm_config: BaseLLMConfig, **kwargs):
        self.llm_config = llm_config
        self.memory_store = None  # falsy means no conversation memory
        self.messages = list()
        self.default_system_role_content = {"role": "system", "content": "You are a helpful assistant"}
        self.system_role_content = copy.deepcopy(self.default_system_role_content)
        self.kwargs = kwargs

    def setup_system_content(self, system_content: str):
        self.system_role_content["content"] = system_content
        return self.system_role_content

    def ask(self, query: str, stream: bool = False, temperature: Optional[float] = None, **kwargs):
        # Subclasses implement the actual provider call
        raise NotImplementedError

    async def aask(self, query: str, stream: bool = False, temperature: Optional[float] = None, **kwargs):
        # return await AsyncUtil.async_run(self.ask, query, stream, temperature, **kwargs)
        return await AsyncUtil.SyncToAsync(self.ask)(query, stream, temperature, **kwargs)

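This class exposes a basic ask interface in both synchronous (ask) and asynchronous (aask) forms, and leaves the concrete implementation of ask to subclasses. It also lets callers set the system role content.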

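Note: the asynchronous aask method is implemented by using the AsyncUtil utility class to turn the ask method into an asynchronous one, which is very convenient when synchronous and asynchronous code are mixed. For details, see: /HuiDBK/py-t…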
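To illustrate the idea in the note above, here is a minimal sketch of running a blocking ask off the event loop. It uses the standard library's asyncio.to_thread (Python 3.9+) as a stand-in; this is an assumption for illustration and is not a claim about how AsyncUtil is implemented. EchoClient is a hypothetical class that only simulates a slow call.

```python
import asyncio
import time


class EchoClient:
    """Hypothetical stand-in for an LLM client with a blocking ask()."""

    def ask(self, query: str, stream: bool = False, **kwargs) -> str:
        time.sleep(0.05)  # simulate a blocking network call
        return f"echo: {query}"

    async def aask(self, query: str, stream: bool = False, **kwargs) -> str:
        # Run the blocking ask() in a worker thread so the event loop stays free
        return await asyncio.to_thread(self.ask, query, stream, **kwargs)


async def demo() -> list:
    client = EchoClient()
    # The two calls overlap instead of running back to back
    return await asyncio.gather(client.aask("a"), client.aask("b"))


results = asyncio.run(demo())
print(results)
```

Because the blocking call runs in a worker thread, other coroutines can keep making progress while a request is in flight.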

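OpenAI client implementation

Next, taking the OpenAI API as an example, we show how to inherit from the base class and implement the concrete behavior: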

#!/usr/bin/python3
# -*- coding: utf-8 -*-
# @Author: Hui
# @File: client.py
# @Desc: { llm client }
# @Date: 2024/07/24 20:45
from openai import OpenAI
from src.llm.base import BaseLLMClient
from src.llm.config import OpenAIConfig
class OpenAIClient(BaseLLMClient):
    def __init__(self, llm_config: OpenAIConfig, **kwargs):
        super().__init__(llm_config=llm_config, **kwargs)
        self.llm_config = llm_config
        self.llm_client = OpenAI(
            api_key=llm_config.api_key,
            base_url=llm_config.base_url,
            **kwargs,
        )
    def setup_local_memory_store(self):
        self.memory_store = "local-memory"
        return self.memory_store
    def get_messages(self, query):
        user_message = {"role": "user", "content": query}
        if not self.memory_store:
            return [
                self.system_role_content,
                user_message,
            ]
        # use memory
        if self.messages:
            self.messages[0] = self.system_role_content
        else:
            self.messages.append(self.system_role_content)
        self.messages.append(user_message)
        return self.messages
    def clear_context(self):
        self.messages.clear()
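    def _stream_maker(self, response):
        """Yield streamed text chunks and accumulate the full reply."""
        full_content = ""
        for resp in response:
            # Some chunks (for example the final one) carry no choices or no text
            if not resp.choices:
                continue
            content = resp.choices[0].delta.content or ""
            yield content
            full_content = f"{full_content}{content}"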
    
        if self.memory_store:
            # store ask context
            self.messages.append({"role": "assistant", "content": full_content})
    
    def _handle_response(self, response, stream: bool = False):
        if stream:
            # streaming response
            return self._stream_maker(response)
    
        resp_message = response.choices[0].message
        if self.memory_store:
            # store ask context
            self.messages.append({"role": "assistant", "content": resp_message.content})
    
        return resp_message.content
    
    def ask(self, query: str, stream: bool = False, temperature: float = None, response_format=None, **kwargs):
        messages = self.get_messages(query)
        response = self.llm_client.chat.completions.create(
            model=self.llm_config.llm_model.value,
            messages=messages,
            stream=stream,
            response_format=response_format,
            temperature=temperature,
            **kwargs,
        )
        resp_content = self._handle_response(response, stream)
        return resp_content

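Here OpenAIClient inherits from BaseLLMClient and provides the concrete ask method. It sends the request to the OpenAI API through self.llm_client and returns the generated content. When memory_store is set, get_messages keeps the running conversation in self.messages; otherwise each request contains only the system message and the current question.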
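The message handling described above can be sketched in isolation. The standalone function below mirrors the get_messages logic from the listing; build_messages and its use_memory flag are hypothetical names used only for this illustration, and no API call is made.

```python
SYSTEM = {"role": "system", "content": "You are a helpful assistant"}


def build_messages(history: list, query: str, use_memory: bool) -> list:
    """Mirror of the get_messages logic, written as a standalone function."""
    user_message = {"role": "user", "content": query}
    if not use_memory:
        # Stateless: every request is just the system prompt plus the question
        return [SYSTEM, user_message]
    if history:
        history[0] = SYSTEM  # keep the system prompt at index 0
    else:
        history.append(SYSTEM)
    history.append(user_message)
    return history


history: list = []
build_messages(history, "hi", use_memory=True)
# The client stores the assistant reply after each call when memory is on
history.append({"role": "assistant", "content": "hello"})
second = build_messages(history, "again", use_memory=True)
stateless = build_messages([], "hi", use_memory=False)
print([m["role"] for m in second])
print([m["role"] for m in stateless])
```

With memory on, the list grows with every turn, so a long conversation sends more tokens per request; clear_context resets it.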


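LLM configuration classes

The configuration classes hold the information an LLM service needs, such as the API key and the model. This decouples the client code from configuration details and makes both easier to manage and maintain.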

#!/usr/bin/python3
# -*- coding: utf-8 -*-
# @Author: Hui
# @File: config.py
# @Desc: { llm config }
# @Date: 2024/08/24 12:45
from typing import Optional
from pydantic import BaseModel, Field
from src.llm.schemas import OpenAIModel
class BaseLLMConfig(BaseModel):
    api_key: str
    base_url: Optional[str] = None  # explicit default so the field is optional in Pydantic v2 as well
class OpenAIConfig(BaseLLMConfig):
    llm_model: OpenAIModel = Field(description="OpenAI model type")
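
The article imports OpenAIModel and LLMType from src.llm.schemas but never shows that module. As a rough sketch of what it might contain, the enums below use the member names that appear in the article's code (OPENAI, DEEPSEEK, DEEPSEEK_CODER); the other member names and all string values are assumptions.

```python
from enum import Enum


class LLMType(Enum):
    """Providers the factory can dispatch on (string values are assumptions)."""
    OPENAI = "openai"
    DEEPSEEK = "deepseek"


class OpenAIModel(Enum):
    """Model identifiers passed to the API (string values are assumptions)."""
    GPT_4O = "gpt-4o"                 # hypothetical member name
    GPT_35_TURBO = "gpt-3.5-turbo"    # hypothetical member name
    DEEPSEEK_CODER = "deepseek-coder"


# The client reads the raw model name via `.value`
print(OpenAIModel.DEEPSEEK_CODER.value)
```

Using enums keeps model names in one place and lets Pydantic reject an unknown model when the config is built.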

Factory pattern


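To further simplify instantiating the different LLM clients, we use the factory pattern to build them dynamically. With this design, the caller only passes in an LLM type and a configuration, and the factory class returns the matching client instance.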

#!/usr/bin/python3
# -*- coding: utf-8 -*-
# @Author: Hui
# @File: factory.py
# @Desc: { llm factory }
# @Date: 2024/08/24 12:27
from typing import Type, TypeVar
from src.llm.base import BaseLLMClient
from src.llm.client import OpenAIClient
from src.llm.config import BaseLLMConfig
from src.llm.schemas import LLMType
T_BaseLLMClient = TypeVar("T_BaseLLMClient", bound=BaseLLMClient)
class LLMFactory:
    LLM_CLIENT_MAPPING: dict[LLMType, Type[BaseLLMClient]] = {
        LLMType.OPENAI: OpenAIClient,
        LLMType.DEEPSEEK: OpenAIClient,
    }
    @classmethod
    def build(cls, llm_type: LLMType, llm_config: BaseLLMConfig, **kwargs) -> T_BaseLLMClient:
        if llm_type not in cls.LLM_CLIENT_MAPPING:
            raise ValueError(f"unsupported LLM type {llm_type}")
        llm_client_cls = cls.LLM_CLIENT_MAPPING.get(llm_type)
        return llm_client_cls(llm_config=llm_config, **kwargs)

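With this mapping table, the factory class is easy to extend: supporting a new LLM type only takes a new entry that maps the type to its client class.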
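As a self-contained sketch of that extension step, the example below adds a second provider to a simplified copy of the factory. SparkClient, LLMType.SPARK, and the dict-based config are hypothetical stand-ins so the snippet runs without the src.llm package or any API key.

```python
from enum import Enum
from typing import Dict, Type


class LLMType(Enum):
    OPENAI = "openai"
    SPARK = "spark"  # hypothetical new provider


class BaseLLMClient:
    def __init__(self, llm_config: dict, **kwargs):
        self.llm_config = llm_config

    def ask(self, query: str) -> str:
        raise NotImplementedError


class OpenAIClient(BaseLLMClient):
    def ask(self, query: str) -> str:
        return f"[openai] {query}"  # stub instead of a real API call


class SparkClient(BaseLLMClient):  # hypothetical client for the new provider
    def ask(self, query: str) -> str:
        return f"[spark] {query}"  # stub instead of a real API call


class LLMFactory:
    LLM_CLIENT_MAPPING: Dict[LLMType, Type[BaseLLMClient]] = {
        LLMType.OPENAI: OpenAIClient,
        LLMType.SPARK: SparkClient,  # the only factory change needed for Spark
    }

    @classmethod
    def build(cls, llm_type: LLMType, llm_config: dict, **kwargs) -> BaseLLMClient:
        if llm_type not in cls.LLM_CLIENT_MAPPING:
            raise ValueError(f"unsupported LLM type {llm_type}")
        return cls.LLM_CLIENT_MAPPING[llm_type](llm_config=llm_config, **kwargs)


client = LLMFactory.build(LLMType.SPARK, llm_config={"api_key": "demo"})
print(client.ask("hello"))
```

Calling code depends only on LLMFactory.build and the BaseLLMClient interface, so switching providers is a change of the llm_type argument.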


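Usage example

The OpenAI client supports GPT-family models such as gpt-4o and gpt-3.5-turbo, and it also works with Deepseek, whose API is OpenAI-compatible.

For this demo I signed up for Deepseek, which comes with a free quota of 5 million tokens.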

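import asyncio

# Module paths follow the file headers of the listings above
from src.llm.client import OpenAIClient
from src.llm.config import OpenAIConfig
from src.llm.factory import LLMFactory
from src.llm.schemas import LLMType, OpenAIModel

# llm_setting is the author's settings object holding the Deepseek key and
# base URL; its import is not shown in the original.


async def main():
    llm_config = OpenAIConfig(
        api_key=llm_setting.deepseek_api_key,
        base_url=llm_setting.deepseek_base_url,
        llm_model=OpenAIModel.DEEPSEEK_CODER,
    )
    llm_client: OpenAIClient = LLMFactory.build(llm_type=LLMType.OPENAI, llm_config=llm_config)

    query = "Summarize this article https://juejin.cn/post/7283532551473725497"
    print("query:", query)
    resp = await llm_client.aask(query)  # aask is a coroutine and must be awaited
    print("resp:", resp)

    query = "How to get along with girls"
    print("query:", query)
    # With stream=True, aask returns a generator of text chunks
    resp = await llm_client.aask(query, stream=True)
    for chunk in resp:
        print(chunk, end="")


if __name__ == "__main__":
    asyncio.run(main())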

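In the run, the bare LLM cannot read the content behind a link, so for the first query it gives an irrelevant, made-up answer. Adding a browser engine later on can strengthen the LLM in this respect.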

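3. Conclusion

By wrapping the LLM clients and using the factory pattern, we can manage several different LLM APIs with little effort, and the combination of configuration classes and a factory greatly reduces the complexity of calling an LLM. The design also makes it easy to extend to new LLM services in the future.

For more complex needs, such as dynamic model selection or conversation memory management, this kind of wrapper also provides a good foundation.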
