在任意LLM模型中实现function calling

OpenAI之前给自己的新模型提供了function calling的能力具体的内容可查看官方的Doc，暨在给LLM提供tools工具的情况下，他能根据与你的对话内容自主选择函数。

OpenAI的Function calling

tools=[
    {
      "type": "function",
      "function": {
        "name": "get_current_temperature",
        "description": "Get the current temperature for a specific location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "The city and state, e.g., San Francisco, CA"
            },
            "unit": {
              "type": "string",
              "enum": ["Celsius", "Fahrenheit"],
              "description": "The temperature unit to use. Infer this from the user's location."
            }
          },
          "required": ["location", "unit"]
        }
      }
    }
  ]

OpenAI的接口中出现了一个新的参数tools，将可以让LLM调用的的函数注册到这个参数中。

{
  "id": "run_qJL1kI9xxWlfE0z1yfL0fGg9",
  ...
  "status": "requires_action",
  "required_action": {
    "submit_tool_outputs": {
      "tool_calls": [
        {
          "id": "call_FthC9qRpsL5kBpwwyw6c7j4k",
          "function": {
            "arguments": "{"location": "San Francisco, CA"}",
            "name": "get_rain_probability"
          },
          "type": "function"
        }
    ...
    "type": "submit_tool_outputs"
  }
}

貌似如果入参携带了tools参数，那么就一定会返回如上调用结构，而不是正常的Chat模式，再根据返回的数据调用函数获得函数结果后再喂给LLM，LLM再结合函数结果和用户content回复用户。

整个过程的流程图如下：

说的那么好，那么哪里才能搞到OpenAI的API呢。

虽然官方的路子可能不太好搞，但是野路子多的是，不在这次的讨论范围内。现在要考虑如何不使用OpenAI接口的情况下实现function calling。目前市面上各种开源不开源的各种LLM都有各自的特色，况且百度的基础LLM都免费可以用了。

实现通用的Function calling

本次的目标就是期望能够在不需要OpenAI API的情况下,任意的LLM后端，如LLama.cpp、Ollama或者其他的后端都能实现Function calling功能，只需要Text Generate就能使用这个功能。

Function calling原理

在刚开始看到function calling的时候确实是以为又出现了什么新的模型，但这个功能确实用处很大，在比起单纯的文本生成或者后面新出的RAG来说信息的准确性和其他系统的交能力肯定有质的提升。

于是某一天在搜索Ollama有没有function call的时候在langchain中发现了一点思路 ollama_functions.ts

神奇的赛博咒语

Ollama官方并没有function calling的功能，但是langchain却实现了，原理就是使用了一段prompt

You have access to the following tools:
{tools}
You must always select one of the above tools and respond with only a JSON object matching the following schema:
{{
  "tool": <name of the selected tool>,
  "tool_input": <parameters for the selected tool, matching the tool's JSON schema>
}}

为了兼容OpenAI function的结构也是一样的

protected defaultResponseFunction = {
    name: "__conversational_response",
    description:
      "Respond conversationally if no other tools should be called for a given query.",
    parameters: {
      type: "object",
      properties: {
        response: {
          type: "string",
          description: "Conversational response to the user.",
        },
      },
      required: ["response"],
    },
};

看到这醍醐灌顶，这解决方法真是简单粗暴，不需要新的模型，也不需要fine tune，只需要设置这段system prompt只要LLM的逻辑能力够强就能直接让他根据你提供的内容输出指定的json格式，然后解析json并处理函数就可以了。

灵活性甚至比OpenAI的还要高，因为OpenAI一旦启用function calling功能就只能输出submit_tool_outputs无法正常回答问题，需要将运行结果和content再次输入模型获得新的结果，也就是说即使用户提问与任何tool都无关，还是需要执行整套流程，如上面流程图所画。

只要小小的修改一下prompt，就能在用户的内容与tools无关的情况下直接返回信息，而不需要再次输入模型。这样就能既节省时间又省钱。

You have access to the following tools:
{tools}
You can select one of the above tools or just response user's content and respond with only a JSON object matching the following schema:
{{
  "tool": <name of the selected tool>,
  "tool_input": <parameters for the selected tool, matching the tool's JSON schema>,
  "message": <direct response users content>
}}

这样的prompt就能在用户提问与tools都不相关的情况直接回答用户问题。

如果LLM逻辑能力够强是能直接返回json数据，我们只需要解析json并执行函数就可以了。

{
    "tool":"tool",
    "tool_input":{
        "arg1":"arg1",
        "arg2":false
    },
    "message":"LLM message"
}

执行函数也不用详细说了，只需要类似如下的函数表来执行函数，动态语言的话是很容易做到的。

def getWeather(arg):
    pass
def getTemperature(arg):
    pass
functions = {
    "getWeather":getWeather,
    "getTemperature":getTemperature
}

可行性测试

我自己写了一个Demo，在本地使用llama3、百度免费的ERNIE Speed和阿里通义都能理解prompt并返回正确的json。

其他的LLM没有试过，毕竟其他的在线大模型都还是收费的。

但是均有一点小问题：

llama3不知道为什么偶尔会将函数名内添加空格，导致执行函数失败，但是问题不大执行前去空格就行。
百度ERNIE Speed返回的内容不只是json，他返回的类似好的，这是您需要的json格式内容:<json内容>，所以需要先提取json再处理。

在任意LLM模型中实现function calling

阅读此文章之前，你可能需要首先阅读以下的文章才能更好的理解上下文。

在任意LLM模型中实现function calling

OpenAI的Function calling

实现通用的Function calling

Function calling原理

神奇的赛博咒语

可行性测试