Context management

Context management is essential for LLM-based conversation: it lets the model produce more accurate responses based on the chat history. Tencent Real-Time Communication (TRTC) AI provides basic context management out of the box and also supports developers in building their own, richer context management solutions.

Basic Context Management:

TRTC AI provides basic context management features. In the LLMConfig parameters, a History parameter controls context management:
History:
Sets the number of conversation rounds kept as the LLM's context. The default value is 0 (no context is carried over).
Maximum value: 50 (context is kept for the most recent 50 rounds).
A sample configuration is shown below:
"LLMConfig": {
    "LLMType": "openai",
    "Model": "gpt-4o",
    "APIKey": "api-key",
    "APIUrl": "https://api.openai.com/v1/chat/completions",
    "Streaming": true,
    "SystemPrompt": "You are a personal assistant",
    "Timeout": 3.0,
    "History": 5  // Up to 50 rounds of conversation are supported; the default value is 0.
}
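
To make the effect of History concrete, the sketch below approximates how the message list sent to the model changes with the setting: the system prompt is always kept, plus the most recent rounds up to the configured limit. This is only an illustration of the idea, not TRTC's internal implementation; how rounds are counted internally may differ.

# Illustrative only: approximates the effect of the "History" setting on the
# message list sent to the LLM.
def trim_history(messages, history_rounds):
    """Keep the system prompt, the last `history_rounds` rounds, and the current user turn."""
    system = [m for m in messages if m["role"] == "system"]
    dialog = [m for m in messages if m["role"] != "system"]
    current = dialog[-1:]   # the user turn being answered now
    earlier = dialog[:-1]   # previous user/assistant rounds
    kept = earlier[-2 * history_rounds:] if history_rounds > 0 else []
    return system + kept + current


messages = [
    {"role": "system", "content": "You are a personal assistant"},
    {"role": "user", "content": "My name is Alice."},
    {"role": "assistant", "content": "Nice to meet you, Alice!"},
    {"role": "user", "content": "What is my name?"},
]
# With History = 0 the model never sees the earlier turns;
# with History = 5 the previous round is included and the model can answer.
print(trim_history(messages, history_rounds=0))
print(trim_history(messages, history_rounds=5))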

Custom Context Management:

The TRTC AI conversation service supports the standard OpenAI API specification, which allows developers to implement customized context management in their own backend. The implementation process is as follows:



This flowchart shows the basic steps of custom context management. Developers can adjust and optimize the process to fit their specific needs.
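
Because the interface follows the OpenAI chat completions specification, the request that arrives at the configured APIUrl can be simulated with an ordinary HTTP client, which is convenient for testing the backend before wiring it into TRTC. The snippet below is only an approximation: the URL is a placeholder, and the exact fields and headers TRTC attaches (for example, how the configured APIKey is carried) should be verified against your own request logs.

import requests

payload = {
    "model": "gpt-4o",
    "messages": [
        {"role": "system", "content": "You are a personal assistant"},
        {"role": "user", "content": "What's the weather like today?"},
    ],
    "temperature": 0.7,
}
resp = requests.post(
    "http://your-backend.example.com/v1/chat/completions",  # placeholder URL
    headers={"Authorization": "Bearer api-key"},
    json=payload,
    timeout=10,
)
print(resp.json()["choices"][0]["message"]["content"])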

Implementation Example

Developers can implement an OpenAI API-compatible LLM interface in their own business backend, assemble the context there, and forward the resulting request to a third-party LLM. A simplified code sample is shown below:
import time
import uuid

from fastapi import FastAPI, HTTPException
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel
from typing import List, Optional
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage
from langchain_openai import ChatOpenAI


app = FastAPI(debug=True)

# Add CORS middleware.
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)


class Message(BaseModel):
    role: str
    content: str


class ChatRequest(BaseModel):
    model: str
    messages: List[Message]
    temperature: Optional[float] = 0.7


class ChatResponse(BaseModel):
    id: str
    object: str
    created: int
    model: str
    choices: List[dict]
    usage: dict


@app.post("/v1/chat/completions")
async def chat_completions(request: ChatRequest):
    try:
        # Convert the request messages to the LangChain message format.
        langchain_messages = []
        for msg in request.messages:
            if msg.role == "system":
                langchain_messages.append(SystemMessage(content=msg.content))
            elif msg.role == "user":
                langchain_messages.append(HumanMessage(content=msg.content))
            elif msg.role == "assistant":
                langchain_messages.append(AIMessage(content=msg.content))

        # Add more history here, e.g., turns loaded from your own storage
        # (a per-conversation storage sketch follows this sample).

        # Use LangChain's ChatOpenAI model.
        chat = ChatOpenAI(temperature=request.temperature,
                          model=request.model)
        response = chat.invoke(langchain_messages)
        print(response)

        # Construct a response that conforms to the OpenAI API format.
        return ChatResponse(
            id="chatcmpl-" + uuid.uuid4().hex,
            object="chat.completion",
            created=int(time.time()),
            model=request.model,
            choices=[{
                "index": 0,
                "message": {
                    "role": "assistant",
                    "content": response.content
                },
                "finish_reason": "stop"
            }],
            usage={
                "prompt_tokens": -1,  # Token usage is not extracted in this simplified sample; placeholder values are used.
                "completion_tokens": -1,
                "total_tokens": -1
            }
        )
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))


if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)
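
The sample above leaves history assembly as a placeholder. One simple way to fill it in, sketched below as an extension of that sample, is an in-memory store keyed by a conversation identifier. The {conversation_id} path parameter is a convention invented purely for this illustration (you would encode it in the APIUrl you configure); it is not part of the TRTC or OpenAI specification, and a production deployment would use persistent storage such as Redis rather than process memory.

from collections import defaultdict

# conversation_id -> earlier dialog turns (user/assistant messages only).
conversation_history = defaultdict(list)


@app.post("/{conversation_id}/v1/chat/completions")
async def chat_with_history(conversation_id: str, request: ChatRequest):
    history = conversation_history[conversation_id]

    # Splice stored turns in after the system prompt and before the current turn.
    system = [m for m in request.messages if m.role == "system"]
    current = [m for m in request.messages if m.role != "system"]
    request.messages = system + history + current

    response = await chat_completions(request)

    # Remember the current turn and the assistant's reply for the next request,
    # capping the stored context at the last 20 messages.
    reply = Message(role="assistant",
                    content=response.choices[0]["message"]["content"])
    conversation_history[conversation_id] = (history + current + [reply])[-20:]
    return response

Finally, point the conversation's LLMConfig at the deployed backend instead of the vendor endpoint. The host below is a placeholder; Streaming is set to false because the simplified sample returns a non-streaming response, and History is left at 0 because the backend assembles the context itself in this example:

"LLMConfig": {
    "LLMType": "openai",
    "Model": "gpt-4o",
    "APIKey": "api-key",
    "APIUrl": "http://your-server.example.com:8000/my-room-1/v1/chat/completions", // placeholder; replace with your backend's address
    "Streaming": false,
    "SystemPrompt": "You are a personal assistant",
    "Timeout": 3.0,
    "History": 0
}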